AWS Specific code
Installing libraries
!pip install tensorflow opencv-python pillow scikit-learn
Requirement already satisfied: tensorflow in /home/ec2-user/anaconda3/envs/tensorflow2_p310/lib/python3.10/site-packages (2.16.2)
Requirement already satisfied: opencv-python in /home/ec2-user/anaconda3/envs/tensorflow2_p310/lib/python3.10/site-packages (4.11.0.86)
Requirement already satisfied: pillow in /home/ec2-user/anaconda3/envs/tensorflow2_p310/lib/python3.10/site-packages (11.1.0)
Requirement already satisfied: scikit-learn in /home/ec2-user/anaconda3/envs/tensorflow2_p310/lib/python3.10/site-packages (1.6.1)
import os
os.environ["TF_ENABLE_ONEDNN_OPTS"] = "0"
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
!nvidia-smi
Mon Mar 24 13:08:58 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.144.03 Driver Version: 550.144.03 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA A10G On | 00000000:00:1E.0 Off | 0 |
| 0% 31C P8 16W / 300W | 1MiB / 23028MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| No running processes found |
+-----------------------------------------------------------------------------------------+
import boto3
def download_files_from_bucket(file, bucket):
    '''
    Download a file from the S3 bucket to the local instance.
    The object key is reused as the local file path.
    '''
    bucket_name = bucket
    file_key = file
    local_file_path = file
    s3 = boto3.client('s3')
    s3.download_file(bucket_name, file_key, local_file_path)
    print(f"File downloaded to {local_file_path}")
download_files_from_bucket('stanford-car-dataset-by-classes-folder.zip','pgp-capstone-project')
File downloaded to stanford-car-dataset-by-classes-folder.zip
zip_file_path = 'stanford-car-dataset-by-classes-folder.zip'
!unzip -oq stanford-car-dataset-by-classes-folder.zip
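The `!unzip -oq` shell call above only works where the `unzip` binary is available. A minimal sketch of the same step with the standard-library `zipfile` module follows; the helper name `extract_archive` and the throwaway demo archive are assumptions made for illustration and are not part of the original notebook.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir="."):
    """Extract every member of zip_path into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)  # existing files are overwritten
        return archive.namelist()


# Demo on a throwaway archive so the sketch runs without the real dataset.
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = Path(tmp) / "demo.zip"
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("car_data/train/readme.txt", "hello")
    extracted_names = extract_archive(demo_zip, Path(tmp) / "out")
    extracted_text = (Path(tmp) / "out" / "car_data" / "train" / "readme.txt").read_text()

print(extracted_names)
```

In the notebook this would be called as `extract_archive(zip_file_path)`, which extracts into the current working directory.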
Computer vision can be used to automate supervision and trigger an appropriate action when an event of interest is predicted from an image. For example, a camera can identify a car moving on the road by its make, type, colour, number plate, etc.
Design a DL-based car identification model.
The Cars dataset contains 16,185 images of 196 classes of cars. The data is split into 8,144 training images and 8,041 testing images, where each class has been split roughly in a 50-50 split. Classes are typically at the level of Make, Model, Year, e.g. 2012 Tesla Model S or 2012 BMW M3 coupe.
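As a quick sanity check on the figures quoted above, the following sketch recomputes the split ratio and the average number of images per class. Only the four counts come from the text; the variable names are illustrative.

```python
# Counts quoted in the dataset description
total_images = 16_185
train_images = 8_144
test_images = 8_041
num_classes = 196

# The two splits should add up to the full dataset
assert train_images + test_images == total_images

train_fraction = train_images / total_images
avg_train_per_class = train_images / num_classes
avg_test_per_class = test_images / num_classes

print(f"Training share of the data: {train_fraction:.3f}")
print(f"Average training images per class: {avg_train_per_class:.2f}")
print(f"Average test images per class: {avg_test_per_class:.2f}")
```

With roughly 41 to 42 training images per class on average, each class is small, which is one reason transfer learning from ImageNet weights is a natural choice for this dataset.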
Data description:
‣ Train Images: Consists of real images of cars as per the make and year of the car.
‣ Test Images: Consists of real images of cars as per the make and year of the car.
‣ Train Annotation: Consists of bounding box region for training images.
‣ Test Annotation: Consists of bounding box region for testing images.
import os
import zipfile
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt #for visualization
import matplotlib.patches as patches
import seaborn as sns
from PIL import Image # For image loading and manipulation
from pathlib import Path
import cv2
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, accuracy_score
from sklearn.metrics import confusion_matrix
from sklearn.preprocessing import LabelEncoder
from sklearn.utils.class_weight import compute_class_weight
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, GlobalAveragePooling2D, BatchNormalization
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from sklearn.utils import class_weight
from tensorflow.keras.applications.resnet50 import preprocess_input as resnet_preprocess
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input as mobilenet_preprocess
from tensorflow.keras.applications.inception_v3 import preprocess_input as googlenet_preprocess
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras.applications import ResNet50
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)  # Allow dynamic allocation
    except RuntimeError as e:
        print(e)
4A. Data Handling - Import Data
train_annotations_df = pd.read_csv( "anno_train.csv",header=None)
test_annotations_df = pd.read_csv( "anno_test.csv",header=None)
image_class_df = pd.read_csv( "names.csv",header=None)
train_annotations_df.rename(columns={0:"image_name",1:"xmin",2:"ymin",3:'xmax',4:'ymax',5:'image_class'},inplace=True)
test_annotations_df.rename(columns={0:"image_name",1:"xmin",2:"ymin",3:'xmax',4:'ymax',5:'image_class'},inplace=True)
image_class_df.rename(columns={0:'image_name'},inplace=True)
train_annotations_df.head(5)
| | image_name | xmin | ymin | xmax | ymax | image_class |
|---|---|---|---|---|---|---|
| 0 | 00001.jpg | 39 | 116 | 569 | 375 | 14 |
| 1 | 00002.jpg | 36 | 116 | 868 | 587 | 3 |
| 2 | 00003.jpg | 85 | 109 | 601 | 381 | 91 |
| 3 | 00004.jpg | 621 | 393 | 1484 | 1096 | 134 |
| 4 | 00005.jpg | 14 | 36 | 133 | 99 | 106 |
test_annotations_df.head()
| | image_name | xmin | ymin | xmax | ymax | image_class |
|---|---|---|---|---|---|---|
| 0 | 00001.jpg | 30 | 52 | 246 | 147 | 181 |
| 1 | 00002.jpg | 100 | 19 | 576 | 203 | 103 |
| 2 | 00003.jpg | 51 | 105 | 968 | 659 | 145 |
| 3 | 00004.jpg | 67 | 84 | 581 | 407 | 187 |
| 4 | 00005.jpg | 140 | 151 | 593 | 339 | 185 |
# for images
base_dir = Path(r"./car_data") #replace the directory accordingly
train_images_path = base_dir / "car_data" / "train"
test_images_path = base_dir / "car_data" / "test"
train_images_path = Path(train_images_path).resolve()
test_images_path = Path(test_images_path).resolve()
print(f"train image path is {train_images_path}")
print(f"test image path is {test_images_path}")
train image path is /home/ec2-user/SageMaker/car_data/car_data/train
test image path is /home/ec2-user/SageMaker/car_data/car_data/test
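Before mapping images to classes, it can help to confirm that each class folder actually contains images. The sketch below counts `.jpg` files per parent folder with `pathlib`; the helper name `count_images_per_class` and the temporary demo folders are assumptions made for illustration.

```python
import tempfile
from collections import Counter
from pathlib import Path


def count_images_per_class(split_dir, pattern="*.jpg"):
    """Count image files under split_dir, keyed by the name of their parent folder."""
    return Counter(path.parent.name for path in Path(split_dir).rglob(pattern))


# Demo on a throwaway directory tree that mimics the <split>/<class>/<image> layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for class_name, n_images in {"Class A": 2, "Class B": 1}.items():
        (root / class_name).mkdir()
        for i in range(n_images):
            (root / class_name / f"{i:05d}.jpg").touch()
    demo_counts = count_images_per_class(root)

print(dict(demo_counts))
```

Applied to the real data, `count_images_per_class(train_images_path)` would be expected to return 196 keys if every class folder is populated.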
4B. Data Handling - Map Images w.r.t Classes
#Train Images class mapping
#Folder where multiple train images are stored
train_class_folders = [f.path for f in os.scandir(train_images_path) if f.is_dir()]
train_image_classes = {} # Dictionary to store training image: class mapping
train_images_path = list(train_images_path.rglob("*.jpg"))
# Create a dictionary mapping image filenames to class names (parent folder)
train_image_classes = {img_path.name: img_path.parent.name for img_path in train_images_path}
# Define columns for the Training DataFrame
columns_training = ['Image_Path', 'labels']
# Create an empty DataFrame
df_training = pd.DataFrame(columns=columns_training)
df_training = pd.DataFrame(train_images_path, columns=["Image_Path"])
df_training["labels"] = df_training["Image_Path"].apply(lambda x: Path(x).parent.name)
df_training["Image_Path"] = df_training["Image_Path"].apply(lambda x: str(Path(x).resolve()))
df_training["Image_Path"] = df_training["Image_Path"].astype(str)
print(df_training.head(10))
# --- Print a few mappings to verify ---
print("Sample Training Image to Class Mappings:")
for img_name, class_label in list(train_image_classes.items())[:5]:
    print(f"{img_name}: {class_label}")
Image_Path labels
0 /home/ec2-user/SageMaker/car_data/car_data/tra... Infiniti QX56 SUV 2011
1 /home/ec2-user/SageMaker/car_data/car_data/tra... Infiniti QX56 SUV 2011
2 /home/ec2-user/SageMaker/car_data/car_data/tra... Infiniti QX56 SUV 2011
3 /home/ec2-user/SageMaker/car_data/car_data/tra... Infiniti QX56 SUV 2011
4 /home/ec2-user/SageMaker/car_data/car_data/tra... Infiniti QX56 SUV 2011
5 /home/ec2-user/SageMaker/car_data/car_data/tra... Infiniti QX56 SUV 2011
6 /home/ec2-user/SageMaker/car_data/car_data/tra... Infiniti QX56 SUV 2011
7 /home/ec2-user/SageMaker/car_data/car_data/tra... Infiniti QX56 SUV 2011
8 /home/ec2-user/SageMaker/car_data/car_data/tra... Infiniti QX56 SUV 2011
9 /home/ec2-user/SageMaker/car_data/car_data/tra... Infiniti QX56 SUV 2011
Sample Training Image to Class Mappings:
05829.jpg: Infiniti QX56 SUV 2011
04532.jpg: Infiniti QX56 SUV 2011
04524.jpg: Infiniti QX56 SUV 2011
04856.jpg: Infiniti QX56 SUV 2011
02413.jpg: Infiniti QX56 SUV 2011
#Test Images class mapping
#Folder where multiple test images are stored
test_class_folders = [f.path for f in os.scandir(test_images_path) if f.is_dir()]
test_image_classes = {} # Dictionary to store testing image: class mapping
test_images_path_root = test_images_path.resolve()
test_images_path_list = list(test_images_path_root.rglob("*.jpg"))
# Create a dictionary mapping image filenames to class names (parent folder)
test_image_classes = {img_path.name: img_path.parent.name for img_path in test_images_path_list}
# Define columns for the Testing DataFrame
columns_testing = ['Image_Path', 'labels']
# Create an empty DataFrame
df_testing = pd.DataFrame(columns=columns_testing)
df_testing = pd.DataFrame(test_images_path_list, columns=["Image_Path"])
df_testing["labels"] = df_testing["Image_Path"].apply(lambda x: Path(x).parent.name)
df_testing["Image_Path"] = df_testing["Image_Path"].apply(lambda x: str(Path(x).resolve()))
df_testing["Image_Path"] = df_testing["Image_Path"].astype(str)
print(df_testing.head(10))
print("Sample Testing Image to Class Mappings:")
for img_name, class_label in list(test_image_classes.items())[:5]:
    print(f"{img_name}: {class_label}")
Image_Path labels
0 /home/ec2-user/SageMaker/car_data/car_data/tes... Infiniti QX56 SUV 2011
1 /home/ec2-user/SageMaker/car_data/car_data/tes... Infiniti QX56 SUV 2011
2 /home/ec2-user/SageMaker/car_data/car_data/tes... Infiniti QX56 SUV 2011
3 /home/ec2-user/SageMaker/car_data/car_data/tes... Infiniti QX56 SUV 2011
4 /home/ec2-user/SageMaker/car_data/car_data/tes... Infiniti QX56 SUV 2011
5 /home/ec2-user/SageMaker/car_data/car_data/tes... Infiniti QX56 SUV 2011
6 /home/ec2-user/SageMaker/car_data/car_data/tes... Infiniti QX56 SUV 2011
7 /home/ec2-user/SageMaker/car_data/car_data/tes... Infiniti QX56 SUV 2011
8 /home/ec2-user/SageMaker/car_data/car_data/tes... Infiniti QX56 SUV 2011
9 /home/ec2-user/SageMaker/car_data/car_data/tes... Infiniti QX56 SUV 2011
Sample Testing Image to Class Mappings:
01068.jpg: Infiniti QX56 SUV 2011
02434.jpg: Infiniti QX56 SUV 2011
02499.jpg: Infiniti QX56 SUV 2011
04803.jpg: Infiniti QX56 SUV 2011
00478.jpg: Infiniti QX56 SUV 2011
4C. Data Handling - Map Images w.r.t Annotations
train_annotations_df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8144 entries, 0 to 8143
Data columns (total 6 columns):
 # Column Non-Null Count Dtype
--- ------ -------------- -----
 0 image_name 8144 non-null object
 1 xmin 8144 non-null int64
 2 ymin 8144 non-null int64
 3 xmax 8144 non-null int64
 4 ymax 8144 non-null int64
 5 image_class 8144 non-null int64
dtypes: int64(5), object(1)
memory usage: 381.9+ KB
# ******** Definition of the method ********
def map_images_to_bboxes(annotations_df):
    """Map each image name to its bounding box (xmin, ymin, xmax, ymax)."""
    image_bboxes = {}
    try:
        for _, row in annotations_df.iterrows():
            image_bboxes[row['image_name']] = (
                row['xmin'], row['ymin'], row['xmax'], row['ymax']
            )  # Store bbox as tuple
    except KeyError as e:
        print(f"Error: Column {e} not found in the annotations DataFrame. Check the column names.")
        print("Expected columns: image_name, xmin, ymin, xmax, ymax")
    return image_bboxes
# Train images bounding box mapping
train_image_bboxes = map_images_to_bboxes(train_annotations_df)
# --- Print a few mappings to verify for Training images ---
print("\nSample Training Image to Bounding Box Mappings (DF):")
count = 0
for img_name, bbox in train_image_bboxes.items():
    print(f"{img_name}: {bbox}")
    count += 1
    if count > 5: break
Sample Training Image to Bounding Box Mappings (DF):
00001.jpg: (39, 116, 569, 375)
00002.jpg: (36, 116, 868, 587)
00003.jpg: (85, 109, 601, 381)
00004.jpg: (621, 393, 1484, 1096)
00005.jpg: (14, 36, 133, 99)
00006.jpg: (259, 289, 515, 416)
# Test images bounding box mapping
test_image_bboxes = map_images_to_bboxes(test_annotations_df)
# --- Print a few mappings to verify testing images ---
print("\nSample Testing Image to Bounding Box Mappings (DF):")
count = 0
for img_name, bbox in test_image_bboxes.items():
    print(f"{img_name}: {bbox}")
    count += 1
    if count > 5: break
Sample Testing Image to Bounding Box Mappings (DF):
00001.jpg: (30, 52, 246, 147)
00002.jpg: (100, 19, 576, 203)
00003.jpg: (51, 105, 968, 659)
00004.jpg: (67, 84, 581, 407)
00005.jpg: (140, 151, 593, 339)
00006.jpg: (20, 77, 420, 301)
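The annotations store each box as corner coordinates `(xmin, ymin, xmax, ymax)`, whereas `matplotlib.patches.Rectangle` takes an anchor point plus a width and a height. A minimal sketch of that conversion is shown here; the helper name `corners_to_xywh` is hypothetical, and the sample box is the first training annotation.

```python
def corners_to_xywh(bbox):
    """Convert (xmin, ymin, xmax, ymax) into (x, y, width, height)."""
    x_min, y_min, x_max, y_max = bbox
    if x_max <= x_min or y_max <= y_min:
        raise ValueError(f"Degenerate bounding box: {bbox}")
    return x_min, y_min, x_max - x_min, y_max - y_min


# First training annotation (00001.jpg)
x, y, width, height = corners_to_xywh((39, 116, 569, 375))
print(x, y, width, height)
```

The validity check is an added safeguard, not something the original annotations are known to need.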
# Display images with bounding boxes
def display_image_with_bbox(image_path, annotation):
    # Load image
    img = Image.open(image_path)
    # Create plot
    fig, ax = plt.subplots(1)
    ax.imshow(img)
    # Draw bounding box. Read the coordinates from the annotation argument
    # rather than from the global loop variable `row`.
    x_min, y_min, x_max, y_max = annotation['bbox']
    rect = patches.Rectangle(
        (x_min, y_min),   # anchor corner (x_min, y_min)
        (x_max - x_min),  # width
        (y_max - y_min),  # height
        linewidth=2,
        edgecolor='r',
        facecolor='none'
    )
    ax.add_patch(rect)
    # Add class label
    plt.text(
        x_min, y_min - 10,  # Position of the label
        annotation['image_class'],
        color='red',
        fontsize=12,
        backgroundcolor='white'
    )
    plt.axis('off')
    plt.show()
# Display bounding box for train images
print("For Training Images")
displayed_image_count = 0  # Counter to track displayed images
image_paths_details_training = []
images_paths_details_testing = []
for index, row in train_annotations_df.iterrows():
    if displayed_image_count >= 5:  # Stop once five images have been displayed
        break
    image_name = str(row['image_name']).strip()
    image_path = None  # Stays None if the image is not found in any class folder
    for class_folder in train_class_folders:
        potential_image_path = os.path.join(class_folder, image_name)
        if os.path.exists(potential_image_path):
            image_path = potential_image_path
            image_paths_details_training.append(potential_image_path)
            break  # Image found, no need to check other class folders
    if image_path:
        annotation = {
            'bbox': [row['xmin'], row['ymin'], row['xmax'], row['ymax']],
            'image_class': row['image_class']
        }
        display_image_with_bbox(image_path, annotation)
        displayed_image_count += 1  # Increment the counter
print(f"Displayed {displayed_image_count} training images with bounding boxes.")
For Training Images
Displayed 5 training images with bounding boxes.
# Display bounding box for test images
print("For Testing Images")
displayed_image_count_test = 0  # Counter to track displayed images
for index, row in test_annotations_df.iterrows():
    if displayed_image_count_test >= 5:  # Stop once five images have been displayed
        break
    image_name_test = str(row['image_name']).strip()
    image_path_test = None  # Stays None if the image is not found in any class folder
    for class_folder in test_class_folders:
        potential_image_path_test = os.path.join(class_folder, image_name_test)
        if os.path.exists(potential_image_path_test):
            image_path_test = potential_image_path_test
            # Record the test path (the original appended the training variable here)
            images_paths_details_testing.append(potential_image_path_test)
            break  # Image found, no need to check other class folders
    if image_path_test:
        annotation_test = {
            'bbox': [row['xmin'], row['ymin'], row['xmax'], row['ymax']],
            'image_class': row['image_class']
        }
        display_image_with_bbox(image_path_test, annotation_test)
        displayed_image_count_test += 1  # Increment the counter
print(f"Displayed {displayed_image_count_test} test images with bounding boxes.")
For Testing Images
Displayed 5 test images with bounding boxes.
The Models designed are:
def preprocess_image(image_path, target_size=(224, 224)):
    """
    Load and preprocess an image for CNN input.
    """
    # Check if the image file exists
    if not os.path.exists(image_path):
        print(f"Warning: Image file not found: {image_path}")
        return None  # Or handle the missing image in a way that makes sense for your application
    image = cv2.imread(image_path)  # Load image
    # Check if image loading was successful
    if image is None:
        print(f"Warning: Failed to load image: {image_path}")
        return None  # Or handle the loading error as needed
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # Convert to RGB
    image = cv2.resize(image, target_size)  # Resize to target size
    image = image / 255.0  # Normalize pixel values to [0, 1]
    return image
def custom_generator(df, batch_size, target_size):
    """
    Custom generator for images and labels.
    Loops over df indefinitely, so each epoch must be bounded with steps_per_epoch.
    """
    num_samples = len(df)
    while True:
        for offset in range(0, num_samples, batch_size):
            batch_samples = df.iloc[offset:offset + batch_size]
            images = []
            labels = []
            for _, row in batch_samples.iterrows():
                image = preprocess_image(row['Image_Path'], target_size)
                if image is None:  # Skip images that could not be loaded
                    continue
                images.append(image)
                labels.append(row['label_categorical'])
            X = np.array(images, dtype=np.float32)
            y = np.array(labels, dtype=np.float32)
            yield X, y
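The batching pattern used by `custom_generator` can be seen in isolation with plain Python lists. The sketch below is an illustration only (the name `cycle_batches` is hypothetical): it steps through the data in strides of `batch_size`, yields a shorter final batch when the length is not a multiple of the batch size, and then restarts from the beginning.

```python
from itertools import islice


def cycle_batches(items, batch_size):
    """Yield consecutive batches forever, restarting after each full pass."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    if not items:
        raise ValueError("items must not be empty")
    while True:
        for offset in range(0, len(items), batch_size):
            yield items[offset:offset + batch_size]


# Five samples with a batch size of two: three batches per pass, then wrap around.
first_four = list(islice(cycle_batches(list(range(5)), 2), 4))
print(first_four)  # [[0, 1], [2, 3], [4], [0, 1]]
```

Because the generator never terminates on its own, the training call has to say how many batches make up one epoch.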
# Apply preprocessing to all images
df_testing['image'] = df_testing['Image_Path'].apply(preprocess_image)
df_training['image'] = df_training['Image_Path'].apply(preprocess_image)
# Check for and handle None values in the 'image' column
df_testing = df_testing.dropna(subset=['image']) # Remove rows with None in 'image'
df_training = df_training.dropna(subset=['image']) # Remove rows with None in 'image'
# Encode labels: fit on the training labels, then reuse the same mapping for the test labels
label_encoder = LabelEncoder()
df_training['labels_encoded'] = label_encoder.fit_transform(df_training['labels'])
df_testing['labels_encoded'] = label_encoder.transform(df_testing['labels'])
# Convert labels to categorical (one-hot encoding)
df_testing['label_categorical'] = df_testing['labels_encoded'].apply(lambda x: to_categorical(x, num_classes=len(label_encoder.classes_)))
df_training['label_categorical'] = df_training['labels_encoded'].apply(lambda x: to_categorical(x, num_classes=len(label_encoder.classes_)))
# Split df_training into training and validation sets
df_train, df_val = train_test_split(df_training, test_size=0.2, random_state=42)
# Create generators
#batch_size = 32
batch_size = 16
train_generator = custom_generator(df_train, batch_size, target_size=(224, 224))
val_generator = custom_generator(df_val, batch_size, target_size=(224, 224)) # Use df_val for validation
# Test generator remains the same
test_generator = custom_generator(df_testing, batch_size, target_size=(224, 224))
# Check training generator
X_batch, y_batch = next(train_generator)
print("Training batch shape:", X_batch.shape, y_batch.shape)
# Check validation generator
X_batch, y_batch = next(val_generator)
print("Validation batch shape:", X_batch.shape, y_batch.shape)
Training batch shape: (16, 224, 224, 3) (16, 196)
Validation batch shape: (16, 224, 224, 3) (16, 196)
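The split sizes and the number of batches per epoch follow from simple arithmetic on the 8,144 training images. The sketch below works them out; it assumes that no image was dropped during preprocessing and that the validation size is rounded up, which is how scikit-learn is understood to treat a fractional `test_size`. The helper names are illustrative.

```python
import math


def split_sizes(n_samples, test_fraction):
    """Return (n_train, n_val), rounding the validation size up (assumed rounding rule)."""
    n_val = math.ceil(n_samples * test_fraction)
    return n_samples - n_val, n_val


def steps_for(n_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(n_samples / batch_size)


n_train, n_val = split_sizes(8_144, 0.2)
train_steps = steps_for(n_train, 16)
val_steps = steps_for(n_val, 16)
last_train_batch = n_train - (train_steps - 1) * 16

print(n_train, n_val, train_steps, val_steps, last_train_batch)
```

Under these assumptions the last training batch of each pass is smaller than 16, so not every batch has the shape printed above.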
# Generate a classification report from a Keras/TensorFlow model using GPU-accelerated prediction.
# Assumes df_val['image'] contains pre-loaded images as np.arrays and df_val['label_categorical'] is one-hot encoded.
# Returns y_val_pred, y_val_true (predicted and true label indices) and df_report (the report as a DataFrame).
def generate_classification_report_tf_model(
    model,          # model
    df_val,         # val data frame
    label_encoder,  # label encoder
    preprocess_fn,  # preprocess_input (currently unused in this function)
    batch_size=32,
    report_name="model_report.csv"
):
    # Keep only rows where both image and label are present, so the two arrays stay aligned
    pairs = [
        (img, label)
        for img, label in zip(df_val['image'], df_val['label_categorical'])
        if img is not None and label is not None
    ]
    images = np.stack([img for img, _ in pairs]).astype(np.float32)
    labels = np.stack([label for _, label in pairs])
    # Build tf.data.Dataset
    dataset = tf.data.Dataset.from_tensor_slices((images, labels))
    dataset = dataset.batch(batch_size).prefetch(tf.data.AUTOTUNE)
    # Predict
    preds = model.predict(dataset, verbose=1)
    y_val_pred = np.argmax(preds, axis=1)
    y_val_true = np.argmax(labels, axis=1)
    # Evaluation
    acc = accuracy_score(y_val_true, y_val_pred)
    # Build the per-class report as a dictionary so it can be saved as CSV
    report = classification_report(
        y_val_true, y_val_pred,
        target_names=label_encoder.classes_,
        output_dict=True,
        zero_division=1
    )
    df_report = pd.DataFrame(report).transpose()
    df_report.loc["overall_accuracy"] = [acc, None, None, None]
    df_report.to_csv(report_name)
    print(f"Report saved as: {report_name}")
    # Print only the average metrics
    print(f"Model Accuracy: {acc:.4f}")
    print("Average Summary Metrics:")
    print(df_report.loc[["macro avg", "weighted avg", "overall_accuracy"]][["precision", "recall", "f1-score"]])
    return y_val_pred, y_val_true, df_report
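The accuracy reported by the function above comes from taking the arg-max of each softmax row and of each one-hot label and comparing the resulting indices. A dependency-free sketch of that step follows, with made-up scores for three classes; the function names are hypothetical and this is not a replacement for `accuracy_score`.

```python
def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def accuracy_from_scores(pred_scores, one_hot_labels):
    """Fraction of rows where the predicted class index matches the true one."""
    y_pred = [argmax(row) for row in pred_scores]
    y_true = [argmax(row) for row in one_hot_labels]
    correct = sum(p == t for p, t in zip(y_pred, y_true))
    return correct / len(y_true)


# Made-up softmax outputs and one-hot labels for four samples
scores = [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1], [0.3, 0.3, 0.4], [0.2, 0.5, 0.3]]
targets = [[0, 1, 0], [1, 0, 0], [0, 1, 0], [0, 1, 0]]
demo_accuracy = accuracy_from_scores(scores, targets)
print(demo_accuracy)  # 0.75
```

The third sample is the only miss: the highest score is for class 2 while the label is class 1.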
def plot_training_history(history):
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))
    # Plot accuracy
    ax1.plot(history.history['accuracy'], label='Training Accuracy')
    ax1.plot(history.history['val_accuracy'], label='Validation Accuracy')
    ax1.set_title('Model Accuracy')
    ax1.set_xlabel('Epochs')
    ax1.set_ylabel('Accuracy')
    ax1.legend()
    # Plot loss
    ax2.plot(history.history['loss'], label='Training Loss')
    ax2.plot(history.history['val_loss'], label='Validation Loss')
    ax2.set_title('Model Loss')
    ax2.set_xlabel('Epochs')
    ax2.set_ylabel('Loss')
    ax2.legend()
    plt.show()
6A. MobileNetV2
# 224x224 input to match the generators below; `classes` is omitted because it has no effect when include_top=False
base_model = MobileNetV2(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_224_no_top.h5 9406464/9406464 [==============================] - 0s 0us/step
# Freeze all but last 4 layers for efficient training
for layer in base_model.layers[:-4]:
    layer.trainable = False
# Add custom classification layers
x = base_model.output
x = GlobalAveragePooling2D()(x) # Reduces parameters
x = BatchNormalization()(x) # Stabilizes training
x = Dense(128, activation='relu')(x)
x = Dropout(0.3)(x) # Dropout for regularization
predictions = Dense(len(label_encoder.classes_), activation='softmax')(x) # Output layer
#Split 80-20 of train images
df_train_mobilenet, df_val_mobilenet = train_test_split(df_training, test_size=0.2, random_state=42)
mobilenet_batch_size=16
#df_train_mobilenet_gen = custom_generator(df_train_mobilenet,mobilenet_batch_size,target_size=(128,128))
#df_val_mobilenet_gen = custom_generator(df_val_mobilenet,mobilenet_batch_size,target_size=(128,128))
df_train_mobilenet_gen = custom_generator(df_train_mobilenet,mobilenet_batch_size,target_size=(224,224))
df_val_mobilenet_gen = custom_generator(df_val_mobilenet,mobilenet_batch_size,target_size=(224,224))
# Create the model
mobilenet_model = Model(inputs=base_model.input, outputs=predictions)
# Compile the model
mobilenet_model.compile(optimizer=Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics=['accuracy'])
mobilenet_model.summary()
# Define steps per epoch
steps_per_epoch = np.ceil(len(df_train_mobilenet) / mobilenet_batch_size).astype(int)
validation_steps = np.ceil(len(df_val_mobilenet) / mobilenet_batch_size).astype(int)
y_true = np.array(df_train_mobilenet['labels_encoded'].tolist())
# Compute class weights based on actual class distribution (note: not passed to the fit() call in this cell)
class_weights = compute_class_weight('balanced', classes=np.unique(y_true), y=y_true)
class_weight_dict = dict(enumerate(class_weights))
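For reference, the `'balanced'` option weights each class by `n_samples / (n_classes * count_of_that_class)`, so rarer classes receive larger weights. The pure-Python sketch below reproduces that formula on a toy label list; the function name and the toy labels are illustrative, and it is meant as an explanation of the heuristic rather than a substitute for `compute_class_weight`.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        cls: n_samples / (n_classes * count)
        for cls, count in sorted(counts.items())
    }


# Toy example: class 0 appears three times, class 1 twice, class 2 once
demo_weights = balanced_class_weights([0, 0, 0, 1, 1, 2])
print(demo_weights)
```

In the toy example the rarest class gets a weight of 2.0 and the most frequent one a weight below 1.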
# Train the model
history_mobilenet = mobilenet_model.fit(
    df_train_mobilenet_gen,
    steps_per_epoch=steps_per_epoch,
    validation_data=df_val_mobilenet_gen,
    validation_steps=validation_steps,
    epochs=10  # Reduce epochs to speed up training
)
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, 224, 224, 3)] 0 []
Conv1 (Conv2D) (None, 112, 112, 32) 864 ['input_1[0][0]']
bn_Conv1 (BatchNormalization) (None, 112, 112, 32) 128 ['Conv1[0][0]']
Conv1_relu (ReLU) (None, 112, 112, 32) 0 ['bn_Conv1[0][0]']
expanded_conv_depthwise (DepthwiseConv2D) (None, 112, 112, 32) 288 ['Conv1_relu[0][0]']
expanded_conv_depthwise_BN (BatchNormalization) (None, 112, 112, 32) 128 ['expanded_conv_depthwise[0][0]']
expanded_conv_depthwise_relu (ReLU) (None, 112, 112, 32) 0 ['expanded_conv_depthwise_BN[0][0]']
expanded_conv_project (Conv2D) (None, 112, 112, 16) 512 ['expanded_conv_depthwise_relu[0][0]']
expanded_conv_project_BN (BatchNormalization) (None, 112, 112, 16) 64 ['expanded_conv_project[0][0]']
block_1_expand (Conv2D) (None, 112, 112, 96) 1536 ['expanded_conv_project_BN[0][0]']
block_1_expand_BN (BatchNormalization) (None, 112, 112, 96) 384 ['block_1_expand[0][0]']
block_1_expand_relu (ReLU) (None, 112, 112, 96) 0 ['block_1_expand_BN[0][0]']
block_1_pad (ZeroPadding2D) (None, 113, 113, 96) 0 ['block_1_expand_relu[0][0]']
block_1_depthwise (DepthwiseConv2D) (None, 56, 56, 96) 864 ['block_1_pad[0][0]']
block_1_depthwise_BN (BatchNormalization) (None, 56, 56, 96) 384 ['block_1_depthwise[0][0]']
block_1_depthwise_relu (ReLU) (None, 56, 56, 96) 0 ['block_1_depthwise_BN[0][0]']
block_1_project (Conv2D) (None, 56, 56, 24) 2304 ['block_1_depthwise_relu[0][0]']
block_1_project_BN (BatchNormalization) (None, 56, 56, 24) 96 ['block_1_project[0][0]']
block_2_expand (Conv2D) (None, 56, 56, 144) 3456 ['block_1_project_BN[0][0]']
block_2_expand_BN (BatchNormalization) (None, 56, 56, 144) 576 ['block_2_expand[0][0]']
block_2_expand_relu (ReLU) (None, 56, 56, 144) 0 ['block_2_expand_BN[0][0]']
block_2_depthwise (DepthwiseConv2D) (None, 56, 56, 144) 1296 ['block_2_expand_relu[0][0]']
block_2_depthwise_BN (BatchNormalization) (None, 56, 56, 144) 576 ['block_2_depthwise[0][0]']
block_2_depthwise_relu (ReLU) (None, 56, 56, 144) 0 ['block_2_depthwise_BN[0][0]']
block_2_project (Conv2D) (None, 56, 56, 24) 3456 ['block_2_depthwise_relu[0][0]']
block_2_project_BN (BatchNormalization) (None, 56, 56, 24) 96 ['block_2_project[0][0]']
block_2_add (Add) (None, 56, 56, 24) 0 ['block_1_project_BN[0][0]', 'block_2_project_BN[0][0]']
block_3_expand (Conv2D) (None, 56, 56, 144) 3456 ['block_2_add[0][0]']
block_3_expand_BN (BatchNo (None, 56, 56, 144) 576 ['block_3_expand[0][0]']
rmalization)
block_3_expand_relu (ReLU) (None, 56, 56, 144) 0 ['block_3_expand_BN[0][0]']
block_3_pad (ZeroPadding2D (None, 57, 57, 144) 0 ['block_3_expand_relu[0][0]']
)
block_3_depthwise (Depthwi (None, 28, 28, 144) 1296 ['block_3_pad[0][0]']
seConv2D)
block_3_depthwise_BN (Batc (None, 28, 28, 144) 576 ['block_3_depthwise[0][0]']
hNormalization)
block_3_depthwise_relu (Re (None, 28, 28, 144) 0 ['block_3_depthwise_BN[0][0]']
LU)
block_3_project (Conv2D) (None, 28, 28, 32) 4608 ['block_3_depthwise_relu[0][0]
']
block_3_project_BN (BatchN (None, 28, 28, 32) 128 ['block_3_project[0][0]']
ormalization)
block_4_expand (Conv2D) (None, 28, 28, 192) 6144 ['block_3_project_BN[0][0]']
block_4_expand_BN (BatchNo (None, 28, 28, 192) 768 ['block_4_expand[0][0]']
rmalization)
block_4_expand_relu (ReLU) (None, 28, 28, 192) 0 ['block_4_expand_BN[0][0]']
block_4_depthwise (Depthwi (None, 28, 28, 192) 1728 ['block_4_expand_relu[0][0]']
seConv2D)
block_4_depthwise_BN (Batc (None, 28, 28, 192) 768 ['block_4_depthwise[0][0]']
hNormalization)
block_4_depthwise_relu (Re (None, 28, 28, 192) 0 ['block_4_depthwise_BN[0][0]']
LU)
block_4_project (Conv2D) (None, 28, 28, 32) 6144 ['block_4_depthwise_relu[0][0]
']
block_4_project_BN (BatchN (None, 28, 28, 32) 128 ['block_4_project[0][0]']
ormalization)
block_4_add (Add) (None, 28, 28, 32) 0 ['block_3_project_BN[0][0]',
'block_4_project_BN[0][0]']
block_5_expand (Conv2D) (None, 28, 28, 192) 6144 ['block_4_add[0][0]']
block_5_expand_BN (BatchNo (None, 28, 28, 192) 768 ['block_5_expand[0][0]']
rmalization)
block_5_expand_relu (ReLU) (None, 28, 28, 192) 0 ['block_5_expand_BN[0][0]']
block_5_depthwise (Depthwi (None, 28, 28, 192) 1728 ['block_5_expand_relu[0][0]']
seConv2D)
block_5_depthwise_BN (Batc (None, 28, 28, 192) 768 ['block_5_depthwise[0][0]']
hNormalization)
block_5_depthwise_relu (Re (None, 28, 28, 192) 0 ['block_5_depthwise_BN[0][0]']
LU)
block_5_project (Conv2D) (None, 28, 28, 32) 6144 ['block_5_depthwise_relu[0][0]
']
block_5_project_BN (BatchN (None, 28, 28, 32) 128 ['block_5_project[0][0]']
ormalization)
block_5_add (Add) (None, 28, 28, 32) 0 ['block_4_add[0][0]',
'block_5_project_BN[0][0]']
block_6_expand (Conv2D) (None, 28, 28, 192) 6144 ['block_5_add[0][0]']
block_6_expand_BN (BatchNo (None, 28, 28, 192) 768 ['block_6_expand[0][0]']
rmalization)
block_6_expand_relu (ReLU) (None, 28, 28, 192) 0 ['block_6_expand_BN[0][0]']
block_6_pad (ZeroPadding2D (None, 29, 29, 192) 0 ['block_6_expand_relu[0][0]']
)
block_6_depthwise (Depthwi (None, 14, 14, 192) 1728 ['block_6_pad[0][0]']
seConv2D)
block_6_depthwise_BN (Batc (None, 14, 14, 192) 768 ['block_6_depthwise[0][0]']
hNormalization)
block_6_depthwise_relu (Re (None, 14, 14, 192) 0 ['block_6_depthwise_BN[0][0]']
LU)
block_6_project (Conv2D) (None, 14, 14, 64) 12288 ['block_6_depthwise_relu[0][0]
']
block_6_project_BN (BatchN (None, 14, 14, 64) 256 ['block_6_project[0][0]']
ormalization)
block_7_expand (Conv2D) (None, 14, 14, 384) 24576 ['block_6_project_BN[0][0]']
block_7_expand_BN (BatchNo (None, 14, 14, 384) 1536 ['block_7_expand[0][0]']
rmalization)
block_7_expand_relu (ReLU) (None, 14, 14, 384) 0 ['block_7_expand_BN[0][0]']
block_7_depthwise (Depthwi (None, 14, 14, 384) 3456 ['block_7_expand_relu[0][0]']
seConv2D)
block_7_depthwise_BN (Batc (None, 14, 14, 384) 1536 ['block_7_depthwise[0][0]']
hNormalization)
block_7_depthwise_relu (Re (None, 14, 14, 384) 0 ['block_7_depthwise_BN[0][0]']
LU)
block_7_project (Conv2D) (None, 14, 14, 64) 24576 ['block_7_depthwise_relu[0][0]
']
block_7_project_BN (BatchN (None, 14, 14, 64) 256 ['block_7_project[0][0]']
ormalization)
block_7_add (Add) (None, 14, 14, 64) 0 ['block_6_project_BN[0][0]',
'block_7_project_BN[0][0]']
block_8_expand (Conv2D) (None, 14, 14, 384) 24576 ['block_7_add[0][0]']
block_8_expand_BN (BatchNo (None, 14, 14, 384) 1536 ['block_8_expand[0][0]']
rmalization)
block_8_expand_relu (ReLU) (None, 14, 14, 384) 0 ['block_8_expand_BN[0][0]']
block_8_depthwise (Depthwi (None, 14, 14, 384) 3456 ['block_8_expand_relu[0][0]']
seConv2D)
block_8_depthwise_BN (Batc (None, 14, 14, 384) 1536 ['block_8_depthwise[0][0]']
hNormalization)
block_8_depthwise_relu (Re (None, 14, 14, 384) 0 ['block_8_depthwise_BN[0][0]']
LU)
block_8_project (Conv2D) (None, 14, 14, 64) 24576 ['block_8_depthwise_relu[0][0]
']
block_8_project_BN (BatchN (None, 14, 14, 64) 256 ['block_8_project[0][0]']
ormalization)
block_8_add (Add) (None, 14, 14, 64) 0 ['block_7_add[0][0]',
'block_8_project_BN[0][0]']
block_9_expand (Conv2D) (None, 14, 14, 384) 24576 ['block_8_add[0][0]']
block_9_expand_BN (BatchNo (None, 14, 14, 384) 1536 ['block_9_expand[0][0]']
rmalization)
block_9_expand_relu (ReLU) (None, 14, 14, 384) 0 ['block_9_expand_BN[0][0]']
block_9_depthwise (Depthwi (None, 14, 14, 384) 3456 ['block_9_expand_relu[0][0]']
seConv2D)
block_9_depthwise_BN (Batc (None, 14, 14, 384) 1536 ['block_9_depthwise[0][0]']
hNormalization)
block_9_depthwise_relu (Re (None, 14, 14, 384) 0 ['block_9_depthwise_BN[0][0]']
LU)
block_9_project (Conv2D) (None, 14, 14, 64) 24576 ['block_9_depthwise_relu[0][0]
']
block_9_project_BN (BatchN (None, 14, 14, 64) 256 ['block_9_project[0][0]']
ormalization)
block_9_add (Add) (None, 14, 14, 64) 0 ['block_8_add[0][0]',
'block_9_project_BN[0][0]']
block_10_expand (Conv2D) (None, 14, 14, 384) 24576 ['block_9_add[0][0]']
block_10_expand_BN (BatchN (None, 14, 14, 384) 1536 ['block_10_expand[0][0]']
ormalization)
block_10_expand_relu (ReLU (None, 14, 14, 384) 0 ['block_10_expand_BN[0][0]']
)
block_10_depthwise (Depthw (None, 14, 14, 384) 3456 ['block_10_expand_relu[0][0]']
iseConv2D)
block_10_depthwise_BN (Bat (None, 14, 14, 384) 1536 ['block_10_depthwise[0][0]']
chNormalization)
block_10_depthwise_relu (R (None, 14, 14, 384) 0 ['block_10_depthwise_BN[0][0]'
eLU) ]
block_10_project (Conv2D) (None, 14, 14, 96) 36864 ['block_10_depthwise_relu[0][0
]']
block_10_project_BN (Batch (None, 14, 14, 96) 384 ['block_10_project[0][0]']
Normalization)
block_11_expand (Conv2D) (None, 14, 14, 576) 55296 ['block_10_project_BN[0][0]']
block_11_expand_BN (BatchN (None, 14, 14, 576) 2304 ['block_11_expand[0][0]']
ormalization)
block_11_expand_relu (ReLU (None, 14, 14, 576) 0 ['block_11_expand_BN[0][0]']
)
block_11_depthwise (Depthw (None, 14, 14, 576) 5184 ['block_11_expand_relu[0][0]']
iseConv2D)
block_11_depthwise_BN (Bat (None, 14, 14, 576) 2304 ['block_11_depthwise[0][0]']
chNormalization)
block_11_depthwise_relu (R (None, 14, 14, 576) 0 ['block_11_depthwise_BN[0][0]'
eLU) ]
block_11_project (Conv2D) (None, 14, 14, 96) 55296 ['block_11_depthwise_relu[0][0
]']
block_11_project_BN (Batch (None, 14, 14, 96) 384 ['block_11_project[0][0]']
Normalization)
block_11_add (Add) (None, 14, 14, 96) 0 ['block_10_project_BN[0][0]',
'block_11_project_BN[0][0]']
block_12_expand (Conv2D) (None, 14, 14, 576) 55296 ['block_11_add[0][0]']
block_12_expand_BN (BatchN (None, 14, 14, 576) 2304 ['block_12_expand[0][0]']
ormalization)
block_12_expand_relu (ReLU (None, 14, 14, 576) 0 ['block_12_expand_BN[0][0]']
)
block_12_depthwise (Depthw (None, 14, 14, 576) 5184 ['block_12_expand_relu[0][0]']
iseConv2D)
block_12_depthwise_BN (Bat (None, 14, 14, 576) 2304 ['block_12_depthwise[0][0]']
chNormalization)
block_12_depthwise_relu (R (None, 14, 14, 576) 0 ['block_12_depthwise_BN[0][0]'
eLU) ]
block_12_project (Conv2D) (None, 14, 14, 96) 55296 ['block_12_depthwise_relu[0][0
]']
block_12_project_BN (Batch (None, 14, 14, 96) 384 ['block_12_project[0][0]']
Normalization)
block_12_add (Add) (None, 14, 14, 96) 0 ['block_11_add[0][0]',
'block_12_project_BN[0][0]']
block_13_expand (Conv2D) (None, 14, 14, 576) 55296 ['block_12_add[0][0]']
block_13_expand_BN (BatchN (None, 14, 14, 576) 2304 ['block_13_expand[0][0]']
ormalization)
block_13_expand_relu (ReLU (None, 14, 14, 576) 0 ['block_13_expand_BN[0][0]']
)
block_13_pad (ZeroPadding2 (None, 15, 15, 576) 0 ['block_13_expand_relu[0][0]']
D)
block_13_depthwise (Depthw (None, 7, 7, 576) 5184 ['block_13_pad[0][0]']
iseConv2D)
block_13_depthwise_BN (Bat (None, 7, 7, 576) 2304 ['block_13_depthwise[0][0]']
chNormalization)
block_13_depthwise_relu (R (None, 7, 7, 576) 0 ['block_13_depthwise_BN[0][0]'
eLU) ]
block_13_project (Conv2D) (None, 7, 7, 160) 92160 ['block_13_depthwise_relu[0][0
]']
block_13_project_BN (Batch (None, 7, 7, 160) 640 ['block_13_project[0][0]']
Normalization)
block_14_expand (Conv2D) (None, 7, 7, 960) 153600 ['block_13_project_BN[0][0]']
block_14_expand_BN (BatchN (None, 7, 7, 960) 3840 ['block_14_expand[0][0]']
ormalization)
block_14_expand_relu (ReLU (None, 7, 7, 960) 0 ['block_14_expand_BN[0][0]']
)
block_14_depthwise (Depthw (None, 7, 7, 960) 8640 ['block_14_expand_relu[0][0]']
iseConv2D)
block_14_depthwise_BN (Bat (None, 7, 7, 960) 3840 ['block_14_depthwise[0][0]']
chNormalization)
block_14_depthwise_relu (R (None, 7, 7, 960) 0 ['block_14_depthwise_BN[0][0]'
eLU) ]
block_14_project (Conv2D) (None, 7, 7, 160) 153600 ['block_14_depthwise_relu[0][0
]']
block_14_project_BN (Batch (None, 7, 7, 160) 640 ['block_14_project[0][0]']
Normalization)
block_14_add (Add) (None, 7, 7, 160) 0 ['block_13_project_BN[0][0]',
'block_14_project_BN[0][0]']
block_15_expand (Conv2D) (None, 7, 7, 960) 153600 ['block_14_add[0][0]']
block_15_expand_BN (BatchN (None, 7, 7, 960) 3840 ['block_15_expand[0][0]']
ormalization)
block_15_expand_relu (ReLU (None, 7, 7, 960) 0 ['block_15_expand_BN[0][0]']
)
block_15_depthwise (Depthw (None, 7, 7, 960) 8640 ['block_15_expand_relu[0][0]']
iseConv2D)
block_15_depthwise_BN (Bat (None, 7, 7, 960) 3840 ['block_15_depthwise[0][0]']
chNormalization)
block_15_depthwise_relu (R (None, 7, 7, 960) 0 ['block_15_depthwise_BN[0][0]'
eLU) ]
block_15_project (Conv2D) (None, 7, 7, 160) 153600 ['block_15_depthwise_relu[0][0
]']
block_15_project_BN (Batch (None, 7, 7, 160) 640 ['block_15_project[0][0]']
Normalization)
block_15_add (Add) (None, 7, 7, 160) 0 ['block_14_add[0][0]',
'block_15_project_BN[0][0]']
block_16_expand (Conv2D) (None, 7, 7, 960) 153600 ['block_15_add[0][0]']
block_16_expand_BN (BatchN (None, 7, 7, 960) 3840 ['block_16_expand[0][0]']
ormalization)
block_16_expand_relu (ReLU (None, 7, 7, 960) 0 ['block_16_expand_BN[0][0]']
)
block_16_depthwise (Depthw (None, 7, 7, 960) 8640 ['block_16_expand_relu[0][0]']
iseConv2D)
block_16_depthwise_BN (Bat (None, 7, 7, 960) 3840 ['block_16_depthwise[0][0]']
chNormalization)
block_16_depthwise_relu (R (None, 7, 7, 960) 0 ['block_16_depthwise_BN[0][0]'
eLU) ]
block_16_project (Conv2D) (None, 7, 7, 320) 307200 ['block_16_depthwise_relu[0][0
]']
block_16_project_BN (Batch (None, 7, 7, 320) 1280 ['block_16_project[0][0]']
Normalization)
Conv_1 (Conv2D) (None, 7, 7, 1280) 409600 ['block_16_project_BN[0][0]']
Conv_1_bn (BatchNormalizat (None, 7, 7, 1280) 5120 ['Conv_1[0][0]']
ion)
out_relu (ReLU) (None, 7, 7, 1280) 0 ['Conv_1_bn[0][0]']
global_average_pooling2d ( (None, 1280) 0 ['out_relu[0][0]']
GlobalAveragePooling2D)
batch_normalization (Batch (None, 1280) 5120 ['global_average_pooling2d[0][
Normalization) 0]']
dense (Dense) (None, 128) 163968 ['batch_normalization[0][0]']
dropout (Dropout) (None, 128) 0 ['dense[0][0]']
dense_1 (Dense) (None, 196) 25284 ['dropout[0][0]']
==================================================================================================
Total params: 2452356 (9.35 MB)
Trainable params: 604612 (2.31 MB)
Non-trainable params: 1847744 (7.05 MB)
__________________________________________________________________________________________________
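The parameter counts of the custom head in the summary above can be checked by hand: a Dense layer has `inputs * units + units` parameters, and a BatchNormalization layer has four per feature (gamma and beta, plus the moving mean and variance). The sketch below redoes that arithmetic for the three head layers; the helper names are hypothetical, and the layer sizes are read off the summary.

```python
def dense_params(n_inputs, n_units, use_bias=True):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + (n_units if use_bias else 0)


def batchnorm_params(n_features):
    """gamma and beta (trainable) plus moving mean and variance (non-trainable)."""
    return 4 * n_features


# Layer sizes taken from the summary: 1280 pooled features -> 128 -> 196 classes
head = {
    "batch_normalization": batchnorm_params(1280),
    "dense": dense_params(1280, 128),
    "dense_1": dense_params(128, 196),
}

for name, count in head.items():
    print(f"{name}: {count}")
```

The three values agree with the `Param #` column for `batch_normalization`, `dense`, and `dense_1`.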
Epoch 1/10
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1742822137.282010 13818 service.cc:145] XLA service 0x7fbf4d32a9a0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
I0000 00:00:1742822137.282051 13818 service.cc:153] StreamExecutor device (0): NVIDIA A10G, Compute Capability 8.6
I0000 00:00:1742822137.899225 13818 device_compiler.h:188] Compiled cluster using XLA! This line is logged at most once for the lifetime of the process.
408/408 [==============================] - 99s 82ms/step - loss: 5.3200 - accuracy: 0.0189 - val_loss: 4.8660 - val_accuracy: 0.0442
Epoch 2/10
408/408 [==============================] - 26s 65ms/step - loss: 4.3483 - accuracy: 0.1053 - val_loss: 4.3688 - val_accuracy: 0.1001
Epoch 3/10
408/408 [==============================] - 26s 64ms/step - loss: 3.6912 - accuracy: 0.2111 - val_loss: 3.9805 - val_accuracy: 0.1572
Epoch 4/10
408/408 [==============================] - 26s 63ms/step - loss: 3.1520 - accuracy: 0.3137 - val_loss: 3.6897 - val_accuracy: 0.1989
Epoch 5/10
408/408 [==============================] - 25s 62ms/step - loss: 2.7204 - accuracy: 0.3994 - val_loss: 3.4624 - val_accuracy: 0.2474
Epoch 6/10
408/408 [==============================] - 25s 60ms/step - loss: 2.3169 - accuracy: 0.4936 - val_loss: 3.3009 - val_accuracy: 0.2799
Epoch 7/10
408/408 [==============================] - 23s 57ms/step - loss: 1.9865 - accuracy: 0.5725 - val_loss: 3.1779 - val_accuracy: 0.2916
Epoch 8/10
408/408 [==============================] - 21s 53ms/step - loss: 1.6952 - accuracy: 0.6447 - val_loss: 3.0831 - val_accuracy: 0.2928
Epoch 9/10
408/408 [==============================] - 21s 53ms/step - loss: 1.4385 - accuracy: 0.6993 - val_loss: 3.0007 - val_accuracy: 0.3112
Epoch 10/10
408/408 [==============================] - 21s 52ms/step - loss: 1.2323 - accuracy: 0.7493 - val_loss: 2.9435 - val_accuracy: 0.3143
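In the training log, accuracy on the training set reaches 0.7493 by epoch 10 while validation accuracy ends at 0.3143, a widening gap that usually points to overfitting. The sketch below tabulates that gap per epoch from the logged values; the numbers are copied from the log, and in a live session the same lists would be available as `history_mobilenet.history['accuracy']` and `history_mobilenet.history['val_accuracy']`.

```python
# Accuracy values copied from the 10-epoch training log
train_acc = [0.0189, 0.1053, 0.2111, 0.3137, 0.3994,
             0.4936, 0.5725, 0.6447, 0.6993, 0.7493]
val_acc = [0.0442, 0.1001, 0.1572, 0.1989, 0.2474,
           0.2799, 0.2916, 0.2928, 0.3112, 0.3143]

# Generalization gap per epoch: training accuracy minus validation accuracy
gaps = [round(t - v, 4) for t, v in zip(train_acc, val_acc)]

for epoch, (t, v, gap) in enumerate(zip(train_acc, val_acc, gaps), start=1):
    print(f"epoch {epoch:2d}: train={t:.4f} val={v:.4f} gap={gap:+.4f}")
```

The gap grows at every epoch, and validation accuracy moves only from 0.2916 to 0.3143 over the last four epochs, which suggests that further epochs with this setup would mostly improve the training score.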
# Display model accuracy and loss curves
plot_training_history(history_mobilenet)
# Predict on the validation split and save the per-class classification report
y_pred, y_true, df_mobilenet_classification_report = generate_classification_report_tf_model(
    model=mobilenet_model,
    df_val=df_val_mobilenet,
    label_encoder=label_encoder,
    preprocess_fn=mobilenet_preprocess,
    batch_size=32,
    report_name="mobilenet_classification_report.csv"
)
51/51 [==============================] - 5s 20ms/step
Model Accuracy: 0.3137
Classification Report:
Report saved as: mobilenet_classification_report.csv
Model Accuracy: 0.3137
Average Summary Metrics:
                  precision    recall  f1-score
macro avg          0.315383  0.318741  0.294248
weighted avg       0.342332  0.313689  0.306350
overall_accuracy   0.313689       NaN       NaN
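The summary rows differ in how they combine per-class scores: the macro average is the plain mean over classes, while the weighted average weights each class by its support (number of true samples). This is also why the weighted-average recall (0.313689) equals the overall accuracy. The sketch below shows the two averages on hypothetical per-class F1 scores and supports; the helper names and numbers are illustrative and not taken from the report.

```python
def macro_average(scores):
    """Unweighted mean: every class counts equally."""
    return sum(scores) / len(scores)


def weighted_average(scores, supports):
    """Mean weighted by support: frequent classes count more."""
    total = sum(supports)
    return sum(s * n for s, n in zip(scores, supports)) / total


# Hypothetical per-class F1 scores and supports for three classes
f1_scores = [0.50, 0.20, 0.80]
supports = [10, 30, 60]

macro_f1 = macro_average(f1_scores)
weighted_f1 = weighted_average(f1_scores, supports)
print(f"macro avg F1:    {macro_f1:.4f}")
print(f"weighted avg F1: {weighted_f1:.4f}")
```

When the two averages diverge, performance differs between frequent and rare classes; in the report above they are close, so the low scores are spread across classes rather than concentrated in the rare ones.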
Confusion matrix for the 10 classes with the highest support
# Drop the last three rows of the report (macro avg, weighted avg, overall_accuracy)
df_support = df_mobilenet_classification_report.iloc[:-3]
# Pick the 10 classes with the most validation samples
top_10_classes = df_support.sort_values("support", ascending=False).head(10).index.tolist()
top_10_indices = [np.where(label_encoder.classes_ == cls)[0][0] for cls in top_10_classes]
# Note: these indices line up with the matrix only if y_true/y_pred hold encoded
# integer labels and every class appears at least once in them
mobilenet_cm = confusion_matrix(y_true, y_pred)
# Keep only the rows and columns of the selected classes
mobilenet_cm_top10 = mobilenet_cm[np.ix_(top_10_indices, top_10_indices)]
plt.figure(figsize=(10, 8))
sns.heatmap(mobilenet_cm_top10, annot=True, fmt='d',
            xticklabels=top_10_classes, yticklabels=top_10_classes,
            cmap='Blues')
plt.title("MobileNet Confusion Matrix (Top 10 Classes)")
plt.xlabel("Predicted")
plt.ylabel("True")
plt.tight_layout()
plt.show()
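The `np.ix_` call above selects the same set of indices along both axes of the confusion matrix, giving a square sub-matrix restricted to the chosen classes. The sketch below reproduces that selection in pure Python on a hypothetical 4x4 matrix so the effect is easy to trace; `submatrix` is an illustrative helper, not part of the notebook.

```python
def submatrix(matrix, indices):
    """Keep only the given rows and columns, in the given order."""
    return [[matrix[r][c] for c in indices] for r in indices]


# Hypothetical 4-class confusion matrix (rows = true class, columns = predicted class)
cm = [
    [5, 1, 0, 2],
    [0, 7, 1, 0],
    [3, 0, 4, 1],
    [0, 2, 0, 6],
]

selected = [3, 0]  # e.g. the two classes with the highest support
cm_selected = submatrix(cm, selected)

for row in cm_selected:
    print(row)
```

Predictions that land in classes outside the selection are dropped, so a row of the sub-matrix can sum to less than that class's support. With 196 classes and only 10 shown, the heatmap therefore understates how often the selected classes are confused with the rest.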
Next Steps
6B. GoogleNet (InceptionV3)
# Load InceptionV3 pretrained on ImageNet, without its classification head
base_model = InceptionV3(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
# Add custom layers on top of the base model
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(196, activation='softmax')(x)  # one output per class
# Define the complete model
googlenet_model = Model(inputs=base_model.input, outputs=predictions)
# Freeze the layers of the base model so only the new head is trained
for layer in base_model.layers:
    layer.trainable = False
# Compile the model
googlenet_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/inception_v3/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5
87910968/87910968 [==============================] - 7s 0us/step
googlenet_model.summary()
Model: "model_1"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_2 (InputLayer) [(None, 224, 224, 3)] 0 []
conv2d (Conv2D) (None, 111, 111, 32) 864 ['input_2[0][0]']
batch_normalization_1 (Bat (None, 111, 111, 32) 96 ['conv2d[0][0]']
chNormalization)
activation (Activation) (None, 111, 111, 32) 0 ['batch_normalization_1[0][0]'
]
conv2d_1 (Conv2D) (None, 109, 109, 32) 9216 ['activation[0][0]']
batch_normalization_2 (Bat (None, 109, 109, 32) 96 ['conv2d_1[0][0]']
chNormalization)
activation_1 (Activation) (None, 109, 109, 32) 0 ['batch_normalization_2[0][0]'
]
conv2d_2 (Conv2D) (None, 109, 109, 64) 18432 ['activation_1[0][0]']
batch_normalization_3 (Bat (None, 109, 109, 64) 192 ['conv2d_2[0][0]']
chNormalization)
activation_2 (Activation) (None, 109, 109, 64) 0 ['batch_normalization_3[0][0]'
]
max_pooling2d (MaxPooling2 (None, 54, 54, 64) 0 ['activation_2[0][0]']
D)
conv2d_3 (Conv2D) (None, 54, 54, 80) 5120 ['max_pooling2d[0][0]']
batch_normalization_4 (Bat (None, 54, 54, 80) 240 ['conv2d_3[0][0]']
chNormalization)
activation_3 (Activation) (None, 54, 54, 80) 0 ['batch_normalization_4[0][0]'
]
conv2d_4 (Conv2D) (None, 52, 52, 192) 138240 ['activation_3[0][0]']
batch_normalization_5 (Bat (None, 52, 52, 192) 576 ['conv2d_4[0][0]']
chNormalization)
activation_4 (Activation) (None, 52, 52, 192) 0 ['batch_normalization_5[0][0]'
]
max_pooling2d_1 (MaxPoolin (None, 25, 25, 192) 0 ['activation_4[0][0]']
g2D)
conv2d_8 (Conv2D) (None, 25, 25, 64) 12288 ['max_pooling2d_1[0][0]']
batch_normalization_9 (Bat (None, 25, 25, 64) 192 ['conv2d_8[0][0]']
chNormalization)
activation_8 (Activation) (None, 25, 25, 64) 0 ['batch_normalization_9[0][0]'
]
conv2d_6 (Conv2D) (None, 25, 25, 48) 9216 ['max_pooling2d_1[0][0]']
conv2d_9 (Conv2D) (None, 25, 25, 96) 55296 ['activation_8[0][0]']
batch_normalization_7 (Bat (None, 25, 25, 48) 144 ['conv2d_6[0][0]']
chNormalization)
batch_normalization_10 (Ba (None, 25, 25, 96) 288 ['conv2d_9[0][0]']
tchNormalization)
activation_6 (Activation) (None, 25, 25, 48) 0 ['batch_normalization_7[0][0]'
]
activation_9 (Activation) (None, 25, 25, 96) 0 ['batch_normalization_10[0][0]
']
average_pooling2d (Average (None, 25, 25, 192) 0 ['max_pooling2d_1[0][0]']
Pooling2D)
conv2d_5 (Conv2D) (None, 25, 25, 64) 12288 ['max_pooling2d_1[0][0]']
conv2d_7 (Conv2D) (None, 25, 25, 64) 76800 ['activation_6[0][0]']
conv2d_10 (Conv2D) (None, 25, 25, 96) 82944 ['activation_9[0][0]']
conv2d_11 (Conv2D) (None, 25, 25, 32) 6144 ['average_pooling2d[0][0]']
batch_normalization_6 (Bat (None, 25, 25, 64) 192 ['conv2d_5[0][0]']
chNormalization)
batch_normalization_8 (Bat (None, 25, 25, 64) 192 ['conv2d_7[0][0]']
chNormalization)
batch_normalization_11 (Ba (None, 25, 25, 96) 288 ['conv2d_10[0][0]']
tchNormalization)
batch_normalization_12 (Ba (None, 25, 25, 32) 96 ['conv2d_11[0][0]']
tchNormalization)
activation_5 (Activation) (None, 25, 25, 64) 0 ['batch_normalization_6[0][0]'
]
activation_7 (Activation) (None, 25, 25, 64) 0 ['batch_normalization_8[0][0]'
]
activation_10 (Activation) (None, 25, 25, 96) 0 ['batch_normalization_11[0][0]
']
activation_11 (Activation) (None, 25, 25, 32) 0 ['batch_normalization_12[0][0]
']
mixed0 (Concatenate) (None, 25, 25, 256) 0 ['activation_5[0][0]',
'activation_7[0][0]',
'activation_10[0][0]',
'activation_11[0][0]']
conv2d_15 (Conv2D) (None, 25, 25, 64) 16384 ['mixed0[0][0]']
batch_normalization_16 (Ba (None, 25, 25, 64) 192 ['conv2d_15[0][0]']
tchNormalization)
activation_15 (Activation) (None, 25, 25, 64) 0 ['batch_normalization_16[0][0]
']
conv2d_13 (Conv2D) (None, 25, 25, 48) 12288 ['mixed0[0][0]']
conv2d_16 (Conv2D) (None, 25, 25, 96) 55296 ['activation_15[0][0]']
batch_normalization_14 (Ba (None, 25, 25, 48) 144 ['conv2d_13[0][0]']
tchNormalization)
batch_normalization_17 (Ba (None, 25, 25, 96) 288 ['conv2d_16[0][0]']
tchNormalization)
activation_13 (Activation) (None, 25, 25, 48) 0 ['batch_normalization_14[0][0]
']
activation_16 (Activation) (None, 25, 25, 96) 0 ['batch_normalization_17[0][0]
']
average_pooling2d_1 (Avera (None, 25, 25, 256) 0 ['mixed0[0][0]']
gePooling2D)
conv2d_12 (Conv2D) (None, 25, 25, 64) 16384 ['mixed0[0][0]']
conv2d_14 (Conv2D) (None, 25, 25, 64) 76800 ['activation_13[0][0]']
conv2d_17 (Conv2D) (None, 25, 25, 96) 82944 ['activation_16[0][0]']
conv2d_18 (Conv2D) (None, 25, 25, 64) 16384 ['average_pooling2d_1[0][0]']
batch_normalization_13 (Ba (None, 25, 25, 64) 192 ['conv2d_12[0][0]']
tchNormalization)
batch_normalization_15 (Ba (None, 25, 25, 64) 192 ['conv2d_14[0][0]']
tchNormalization)
batch_normalization_18 (Ba (None, 25, 25, 96) 288 ['conv2d_17[0][0]']
tchNormalization)
batch_normalization_19 (Ba (None, 25, 25, 64) 192 ['conv2d_18[0][0]']
tchNormalization)
activation_12 (Activation) (None, 25, 25, 64) 0 ['batch_normalization_13[0][0]
']
activation_14 (Activation) (None, 25, 25, 64) 0 ['batch_normalization_15[0][0]
']
activation_17 (Activation) (None, 25, 25, 96) 0 ['batch_normalization_18[0][0]
']
activation_18 (Activation) (None, 25, 25, 64) 0 ['batch_normalization_19[0][0]
']
mixed1 (Concatenate) (None, 25, 25, 288) 0 ['activation_12[0][0]',
'activation_14[0][0]',
'activation_17[0][0]',
'activation_18[0][0]']
conv2d_22 (Conv2D) (None, 25, 25, 64) 18432 ['mixed1[0][0]']
batch_normalization_23 (Ba (None, 25, 25, 64) 192 ['conv2d_22[0][0]']
tchNormalization)
activation_22 (Activation) (None, 25, 25, 64) 0 ['batch_normalization_23[0][0]
']
conv2d_20 (Conv2D) (None, 25, 25, 48) 13824 ['mixed1[0][0]']
conv2d_23 (Conv2D) (None, 25, 25, 96) 55296 ['activation_22[0][0]']
batch_normalization_21 (Ba (None, 25, 25, 48) 144 ['conv2d_20[0][0]']
tchNormalization)
batch_normalization_24 (Ba (None, 25, 25, 96) 288 ['conv2d_23[0][0]']
tchNormalization)
activation_20 (Activation) (None, 25, 25, 48) 0 ['batch_normalization_21[0][0]
']
activation_23 (Activation) (None, 25, 25, 96) 0 ['batch_normalization_24[0][0]
']
average_pooling2d_2 (Avera (None, 25, 25, 288) 0 ['mixed1[0][0]']
gePooling2D)
conv2d_19 (Conv2D) (None, 25, 25, 64) 18432 ['mixed1[0][0]']
conv2d_21 (Conv2D) (None, 25, 25, 64) 76800 ['activation_20[0][0]']
conv2d_24 (Conv2D) (None, 25, 25, 96) 82944 ['activation_23[0][0]']
conv2d_25 (Conv2D) (None, 25, 25, 64) 18432 ['average_pooling2d_2[0][0]']
batch_normalization_20 (Ba (None, 25, 25, 64) 192 ['conv2d_19[0][0]']
tchNormalization)
batch_normalization_22 (Ba (None, 25, 25, 64) 192 ['conv2d_21[0][0]']
tchNormalization)
batch_normalization_25 (Ba (None, 25, 25, 96) 288 ['conv2d_24[0][0]']
tchNormalization)
batch_normalization_26 (Ba (None, 25, 25, 64) 192 ['conv2d_25[0][0]']
tchNormalization)
activation_19 (Activation) (None, 25, 25, 64) 0 ['batch_normalization_20[0][0]
']
activation_21 (Activation) (None, 25, 25, 64) 0 ['batch_normalization_22[0][0]
']
activation_24 (Activation) (None, 25, 25, 96) 0 ['batch_normalization_25[0][0]
']
activation_25 (Activation) (None, 25, 25, 64) 0 ['batch_normalization_26[0][0]
']
mixed2 (Concatenate) (None, 25, 25, 288) 0 ['activation_19[0][0]',
'activation_21[0][0]',
'activation_24[0][0]',
'activation_25[0][0]']
conv2d_27 (Conv2D) (None, 25, 25, 64) 18432 ['mixed2[0][0]']
batch_normalization_28 (Ba (None, 25, 25, 64) 192 ['conv2d_27[0][0]']
tchNormalization)
activation_27 (Activation) (None, 25, 25, 64) 0 ['batch_normalization_28[0][0]
']
conv2d_28 (Conv2D) (None, 25, 25, 96) 55296 ['activation_27[0][0]']
batch_normalization_29 (Ba (None, 25, 25, 96) 288 ['conv2d_28[0][0]']
tchNormalization)
activation_28 (Activation) (None, 25, 25, 96) 0 ['batch_normalization_29[0][0]
']
conv2d_26 (Conv2D) (None, 12, 12, 384) 995328 ['mixed2[0][0]']
conv2d_29 (Conv2D) (None, 12, 12, 96) 82944 ['activation_28[0][0]']
batch_normalization_27 (Ba (None, 12, 12, 384) 1152 ['conv2d_26[0][0]']
tchNormalization)
batch_normalization_30 (Ba (None, 12, 12, 96) 288 ['conv2d_29[0][0]']
tchNormalization)
activation_26 (Activation) (None, 12, 12, 384) 0 ['batch_normalization_27[0][0]
']
activation_29 (Activation) (None, 12, 12, 96) 0 ['batch_normalization_30[0][0]
']
max_pooling2d_2 (MaxPoolin (None, 12, 12, 288) 0 ['mixed2[0][0]']
g2D)
mixed3 (Concatenate) (None, 12, 12, 768) 0 ['activation_26[0][0]',
'activation_29[0][0]',
'max_pooling2d_2[0][0]']
conv2d_34 (Conv2D) (None, 12, 12, 128) 98304 ['mixed3[0][0]']
batch_normalization_35 (Ba (None, 12, 12, 128) 384 ['conv2d_34[0][0]']
tchNormalization)
activation_34 (Activation) (None, 12, 12, 128) 0 ['batch_normalization_35[0][0]
']
conv2d_35 (Conv2D) (None, 12, 12, 128) 114688 ['activation_34[0][0]']
batch_normalization_36 (Ba (None, 12, 12, 128) 384 ['conv2d_35[0][0]']
tchNormalization)
activation_35 (Activation) (None, 12, 12, 128) 0 ['batch_normalization_36[0][0]
']
conv2d_31 (Conv2D) (None, 12, 12, 128) 98304 ['mixed3[0][0]']
conv2d_36 (Conv2D) (None, 12, 12, 128) 114688 ['activation_35[0][0]']
batch_normalization_32 (Ba (None, 12, 12, 128) 384 ['conv2d_31[0][0]']
tchNormalization)
batch_normalization_37 (Ba (None, 12, 12, 128) 384 ['conv2d_36[0][0]']
tchNormalization)
activation_31 (Activation) (None, 12, 12, 128) 0 ['batch_normalization_32[0][0]
']
activation_36 (Activation) (None, 12, 12, 128) 0 ['batch_normalization_37[0][0]
']
conv2d_32 (Conv2D) (None, 12, 12, 128) 114688 ['activation_31[0][0]']
conv2d_37 (Conv2D) (None, 12, 12, 128) 114688 ['activation_36[0][0]']
batch_normalization_33 (Ba (None, 12, 12, 128) 384 ['conv2d_32[0][0]']
tchNormalization)
batch_normalization_38 (Ba (None, 12, 12, 128) 384 ['conv2d_37[0][0]']
tchNormalization)
activation_32 (Activation) (None, 12, 12, 128) 0 ['batch_normalization_33[0][0]
']
activation_37 (Activation) (None, 12, 12, 128) 0 ['batch_normalization_38[0][0]
']
average_pooling2d_3 (Avera (None, 12, 12, 768) 0 ['mixed3[0][0]']
gePooling2D)
conv2d_30 (Conv2D) (None, 12, 12, 192) 147456 ['mixed3[0][0]']
conv2d_33 (Conv2D) (None, 12, 12, 192) 172032 ['activation_32[0][0]']
conv2d_38 (Conv2D) (None, 12, 12, 192) 172032 ['activation_37[0][0]']
conv2d_39 (Conv2D) (None, 12, 12, 192) 147456 ['average_pooling2d_3[0][0]']
batch_normalization_31 (Ba (None, 12, 12, 192) 576 ['conv2d_30[0][0]']
tchNormalization)
batch_normalization_34 (Ba (None, 12, 12, 192) 576 ['conv2d_33[0][0]']
tchNormalization)
batch_normalization_39 (Ba (None, 12, 12, 192) 576 ['conv2d_38[0][0]']
tchNormalization)
batch_normalization_40 (Ba (None, 12, 12, 192) 576 ['conv2d_39[0][0]']
tchNormalization)
activation_30 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_31[0][0]
']
activation_33 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_34[0][0]
']
activation_38 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_39[0][0]
']
activation_39 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_40[0][0]
']
mixed4 (Concatenate) (None, 12, 12, 768) 0 ['activation_30[0][0]',
'activation_33[0][0]',
'activation_38[0][0]',
'activation_39[0][0]']
conv2d_44 (Conv2D) (None, 12, 12, 160) 122880 ['mixed4[0][0]']
batch_normalization_45 (Ba (None, 12, 12, 160) 480 ['conv2d_44[0][0]']
tchNormalization)
activation_44 (Activation) (None, 12, 12, 160) 0 ['batch_normalization_45[0][0]
']
conv2d_45 (Conv2D) (None, 12, 12, 160) 179200 ['activation_44[0][0]']
batch_normalization_46 (Ba (None, 12, 12, 160) 480 ['conv2d_45[0][0]']
tchNormalization)
activation_45 (Activation) (None, 12, 12, 160) 0 ['batch_normalization_46[0][0]
']
conv2d_41 (Conv2D) (None, 12, 12, 160) 122880 ['mixed4[0][0]']
conv2d_46 (Conv2D) (None, 12, 12, 160) 179200 ['activation_45[0][0]']
batch_normalization_42 (Ba (None, 12, 12, 160) 480 ['conv2d_41[0][0]']
tchNormalization)
batch_normalization_47 (Ba (None, 12, 12, 160) 480 ['conv2d_46[0][0]']
tchNormalization)
activation_41 (Activation) (None, 12, 12, 160) 0 ['batch_normalization_42[0][0]
']
activation_46 (Activation) (None, 12, 12, 160) 0 ['batch_normalization_47[0][0]
']
conv2d_42 (Conv2D) (None, 12, 12, 160) 179200 ['activation_41[0][0]']
conv2d_47 (Conv2D) (None, 12, 12, 160) 179200 ['activation_46[0][0]']
batch_normalization_43 (Ba (None, 12, 12, 160) 480 ['conv2d_42[0][0]']
tchNormalization)
batch_normalization_48 (Ba (None, 12, 12, 160) 480 ['conv2d_47[0][0]']
tchNormalization)
activation_42 (Activation) (None, 12, 12, 160) 0 ['batch_normalization_43[0][0]
']
activation_47 (Activation) (None, 12, 12, 160) 0 ['batch_normalization_48[0][0]
']
average_pooling2d_4 (Avera (None, 12, 12, 768) 0 ['mixed4[0][0]']
gePooling2D)
conv2d_40 (Conv2D) (None, 12, 12, 192) 147456 ['mixed4[0][0]']
conv2d_43 (Conv2D) (None, 12, 12, 192) 215040 ['activation_42[0][0]']
conv2d_48 (Conv2D) (None, 12, 12, 192) 215040 ['activation_47[0][0]']
conv2d_49 (Conv2D) (None, 12, 12, 192) 147456 ['average_pooling2d_4[0][0]']
batch_normalization_41 (Ba (None, 12, 12, 192) 576 ['conv2d_40[0][0]']
tchNormalization)
batch_normalization_44 (Ba (None, 12, 12, 192) 576 ['conv2d_43[0][0]']
tchNormalization)
batch_normalization_49 (Ba (None, 12, 12, 192) 576 ['conv2d_48[0][0]']
tchNormalization)
batch_normalization_50 (Ba (None, 12, 12, 192) 576 ['conv2d_49[0][0]']
tchNormalization)
activation_40 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_41[0][0]
']
activation_43 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_44[0][0]
']
activation_48 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_49[0][0]
']
activation_49 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_50[0][0]
']
mixed5 (Concatenate) (None, 12, 12, 768) 0 ['activation_40[0][0]',
'activation_43[0][0]',
'activation_48[0][0]',
'activation_49[0][0]']
conv2d_54 (Conv2D) (None, 12, 12, 160) 122880 ['mixed5[0][0]']
batch_normalization_55 (Ba (None, 12, 12, 160) 480 ['conv2d_54[0][0]']
tchNormalization)
activation_54 (Activation) (None, 12, 12, 160) 0 ['batch_normalization_55[0][0]
']
conv2d_55 (Conv2D) (None, 12, 12, 160) 179200 ['activation_54[0][0]']
batch_normalization_56 (Ba (None, 12, 12, 160) 480 ['conv2d_55[0][0]']
tchNormalization)
activation_55 (Activation) (None, 12, 12, 160) 0 ['batch_normalization_56[0][0]
']
conv2d_51 (Conv2D) (None, 12, 12, 160) 122880 ['mixed5[0][0]']
conv2d_56 (Conv2D) (None, 12, 12, 160) 179200 ['activation_55[0][0]']
batch_normalization_52 (Ba (None, 12, 12, 160) 480 ['conv2d_51[0][0]']
tchNormalization)
batch_normalization_57 (Ba (None, 12, 12, 160) 480 ['conv2d_56[0][0]']
tchNormalization)
activation_51 (Activation) (None, 12, 12, 160) 0 ['batch_normalization_52[0][0]
']
activation_56 (Activation) (None, 12, 12, 160) 0 ['batch_normalization_57[0][0]
']
conv2d_52 (Conv2D) (None, 12, 12, 160) 179200 ['activation_51[0][0]']
conv2d_57 (Conv2D) (None, 12, 12, 160) 179200 ['activation_56[0][0]']
batch_normalization_53 (Ba (None, 12, 12, 160) 480 ['conv2d_52[0][0]']
tchNormalization)
batch_normalization_58 (Ba (None, 12, 12, 160) 480 ['conv2d_57[0][0]']
tchNormalization)
activation_52 (Activation) (None, 12, 12, 160) 0 ['batch_normalization_53[0][0]
']
activation_57 (Activation) (None, 12, 12, 160) 0 ['batch_normalization_58[0][0]
']
average_pooling2d_5 (Avera (None, 12, 12, 768) 0 ['mixed5[0][0]']
gePooling2D)
conv2d_50 (Conv2D) (None, 12, 12, 192) 147456 ['mixed5[0][0]']
conv2d_53 (Conv2D) (None, 12, 12, 192) 215040 ['activation_52[0][0]']
conv2d_58 (Conv2D) (None, 12, 12, 192) 215040 ['activation_57[0][0]']
conv2d_59 (Conv2D) (None, 12, 12, 192) 147456 ['average_pooling2d_5[0][0]']
batch_normalization_51 (Ba (None, 12, 12, 192) 576 ['conv2d_50[0][0]']
tchNormalization)
batch_normalization_54 (Ba (None, 12, 12, 192) 576 ['conv2d_53[0][0]']
tchNormalization)
batch_normalization_59 (Ba (None, 12, 12, 192) 576 ['conv2d_58[0][0]']
tchNormalization)
batch_normalization_60 (Ba (None, 12, 12, 192) 576 ['conv2d_59[0][0]']
tchNormalization)
activation_50 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_51[0][0]
']
activation_53 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_54[0][0]
']
activation_58 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_59[0][0]
']
activation_59 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_60[0][0]
']
mixed6 (Concatenate) (None, 12, 12, 768) 0 ['activation_50[0][0]',
'activation_53[0][0]',
'activation_58[0][0]',
'activation_59[0][0]']
conv2d_64 (Conv2D) (None, 12, 12, 192) 147456 ['mixed6[0][0]']
batch_normalization_65 (Ba (None, 12, 12, 192) 576 ['conv2d_64[0][0]']
tchNormalization)
activation_64 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_65[0][0]
']
conv2d_65 (Conv2D) (None, 12, 12, 192) 258048 ['activation_64[0][0]']
batch_normalization_66 (Ba (None, 12, 12, 192) 576 ['conv2d_65[0][0]']
tchNormalization)
activation_65 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_66[0][0]
']
conv2d_61 (Conv2D) (None, 12, 12, 192) 147456 ['mixed6[0][0]']
conv2d_66 (Conv2D) (None, 12, 12, 192) 258048 ['activation_65[0][0]']
batch_normalization_62 (Ba (None, 12, 12, 192) 576 ['conv2d_61[0][0]']
tchNormalization)
batch_normalization_67 (Ba (None, 12, 12, 192) 576 ['conv2d_66[0][0]']
tchNormalization)
activation_61 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_62[0][0]
']
activation_66 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_67[0][0]
']
conv2d_62 (Conv2D) (None, 12, 12, 192) 258048 ['activation_61[0][0]']
conv2d_67 (Conv2D) (None, 12, 12, 192) 258048 ['activation_66[0][0]']
batch_normalization_63 (Ba (None, 12, 12, 192) 576 ['conv2d_62[0][0]']
tchNormalization)
batch_normalization_68 (Ba (None, 12, 12, 192) 576 ['conv2d_67[0][0]']
tchNormalization)
activation_62 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_63[0][0]
']
activation_67 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_68[0][0]
']
average_pooling2d_6 (Avera (None, 12, 12, 768) 0 ['mixed6[0][0]']
gePooling2D)
conv2d_60 (Conv2D) (None, 12, 12, 192) 147456 ['mixed6[0][0]']
conv2d_63 (Conv2D) (None, 12, 12, 192) 258048 ['activation_62[0][0]']
conv2d_68 (Conv2D) (None, 12, 12, 192) 258048 ['activation_67[0][0]']
conv2d_69 (Conv2D) (None, 12, 12, 192) 147456 ['average_pooling2d_6[0][0]']
batch_normalization_61 (Ba (None, 12, 12, 192) 576 ['conv2d_60[0][0]']
tchNormalization)
batch_normalization_64 (Ba (None, 12, 12, 192) 576 ['conv2d_63[0][0]']
tchNormalization)
batch_normalization_69 (Ba (None, 12, 12, 192) 576 ['conv2d_68[0][0]']
tchNormalization)
batch_normalization_70 (Ba (None, 12, 12, 192) 576 ['conv2d_69[0][0]']
tchNormalization)
activation_60 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_61[0][0]
']
activation_63 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_64[0][0]
']
activation_68 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_69[0][0]
']
activation_69 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_70[0][0]
']
mixed7 (Concatenate) (None, 12, 12, 768) 0 ['activation_60[0][0]',
'activation_63[0][0]',
'activation_68[0][0]',
'activation_69[0][0]']
conv2d_72 (Conv2D) (None, 12, 12, 192) 147456 ['mixed7[0][0]']
batch_normalization_73 (Ba (None, 12, 12, 192) 576 ['conv2d_72[0][0]']
tchNormalization)
activation_72 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_73[0][0]
']
conv2d_73 (Conv2D) (None, 12, 12, 192) 258048 ['activation_72[0][0]']
batch_normalization_74 (Ba (None, 12, 12, 192) 576 ['conv2d_73[0][0]']
tchNormalization)
activation_73 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_74[0][0]
']
conv2d_70 (Conv2D) (None, 12, 12, 192) 147456 ['mixed7[0][0]']
conv2d_74 (Conv2D) (None, 12, 12, 192) 258048 ['activation_73[0][0]']
batch_normalization_71 (Ba (None, 12, 12, 192) 576 ['conv2d_70[0][0]']
tchNormalization)
batch_normalization_75 (Ba (None, 12, 12, 192) 576 ['conv2d_74[0][0]']
tchNormalization)
activation_70 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_71[0][0]
']
activation_74 (Activation) (None, 12, 12, 192) 0 ['batch_normalization_75[0][0]
']
conv2d_71 (Conv2D) (None, 5, 5, 320) 552960 ['activation_70[0][0]']
conv2d_75 (Conv2D) (None, 5, 5, 192) 331776 ['activation_74[0][0]']
batch_normalization_72 (Ba (None, 5, 5, 320) 960 ['conv2d_71[0][0]']
tchNormalization)
batch_normalization_76 (Ba (None, 5, 5, 192) 576 ['conv2d_75[0][0]']
tchNormalization)
activation_71 (Activation) (None, 5, 5, 320) 0 ['batch_normalization_72[0][0]
']
activation_75 (Activation) (None, 5, 5, 192) 0 ['batch_normalization_76[0][0]
']
max_pooling2d_3 (MaxPoolin (None, 5, 5, 768) 0 ['mixed7[0][0]']
g2D)
mixed8 (Concatenate) (None, 5, 5, 1280) 0 ['activation_71[0][0]',
'activation_75[0][0]',
'max_pooling2d_3[0][0]']
conv2d_80 (Conv2D) (None, 5, 5, 448) 573440 ['mixed8[0][0]']
batch_normalization_81 (Ba (None, 5, 5, 448) 1344 ['conv2d_80[0][0]']
tchNormalization)
activation_80 (Activation) (None, 5, 5, 448) 0 ['batch_normalization_81[0][0]
']
conv2d_77 (Conv2D) (None, 5, 5, 384) 491520 ['mixed8[0][0]']
conv2d_81 (Conv2D) (None, 5, 5, 384) 1548288 ['activation_80[0][0]']
batch_normalization_78 (Ba (None, 5, 5, 384) 1152 ['conv2d_77[0][0]']
tchNormalization)
batch_normalization_82 (Ba (None, 5, 5, 384) 1152 ['conv2d_81[0][0]']
tchNormalization)
activation_77 (Activation) (None, 5, 5, 384) 0 ['batch_normalization_78[0][0]
']
activation_81 (Activation) (None, 5, 5, 384) 0 ['batch_normalization_82[0][0]
']
conv2d_78 (Conv2D) (None, 5, 5, 384) 442368 ['activation_77[0][0]']
conv2d_79 (Conv2D) (None, 5, 5, 384) 442368 ['activation_77[0][0]']
conv2d_82 (Conv2D) (None, 5, 5, 384) 442368 ['activation_81[0][0]']
conv2d_83 (Conv2D) (None, 5, 5, 384) 442368 ['activation_81[0][0]']
average_pooling2d_7 (Avera (None, 5, 5, 1280) 0 ['mixed8[0][0]']
gePooling2D)
conv2d_76 (Conv2D) (None, 5, 5, 320) 409600 ['mixed8[0][0]']
batch_normalization_79 (Ba (None, 5, 5, 384) 1152 ['conv2d_78[0][0]']
tchNormalization)
batch_normalization_80 (Ba (None, 5, 5, 384) 1152 ['conv2d_79[0][0]']
tchNormalization)
batch_normalization_83 (Ba (None, 5, 5, 384) 1152 ['conv2d_82[0][0]']
tchNormalization)
batch_normalization_84 (Ba (None, 5, 5, 384) 1152 ['conv2d_83[0][0]']
tchNormalization)
conv2d_84 (Conv2D) (None, 5, 5, 192) 245760 ['average_pooling2d_7[0][0]']
batch_normalization_77 (Ba (None, 5, 5, 320) 960 ['conv2d_76[0][0]']
tchNormalization)
activation_78 (Activation) (None, 5, 5, 384) 0 ['batch_normalization_79[0][0]
']
activation_79 (Activation) (None, 5, 5, 384) 0 ['batch_normalization_80[0][0]
']
activation_82 (Activation) (None, 5, 5, 384) 0 ['batch_normalization_83[0][0]
']
activation_83 (Activation) (None, 5, 5, 384) 0 ['batch_normalization_84[0][0]
']
batch_normalization_85 (Ba (None, 5, 5, 192) 576 ['conv2d_84[0][0]']
tchNormalization)
activation_76 (Activation) (None, 5, 5, 320) 0 ['batch_normalization_77[0][0]
']
mixed9_0 (Concatenate) (None, 5, 5, 768) 0 ['activation_78[0][0]',
'activation_79[0][0]']
concatenate (Concatenate) (None, 5, 5, 768) 0 ['activation_82[0][0]',
'activation_83[0][0]']
activation_84 (Activation) (None, 5, 5, 192) 0 ['batch_normalization_85[0][0]
']
mixed9 (Concatenate) (None, 5, 5, 2048) 0 ['activation_76[0][0]',
'mixed9_0[0][0]',
'concatenate[0][0]',
'activation_84[0][0]']
conv2d_89 (Conv2D) (None, 5, 5, 448) 917504 ['mixed9[0][0]']
batch_normalization_90 (Ba (None, 5, 5, 448) 1344 ['conv2d_89[0][0]']
tchNormalization)
activation_89 (Activation) (None, 5, 5, 448) 0 ['batch_normalization_90[0][0]
']
conv2d_86 (Conv2D) (None, 5, 5, 384) 786432 ['mixed9[0][0]']
conv2d_90 (Conv2D) (None, 5, 5, 384) 1548288 ['activation_89[0][0]']
batch_normalization_87 (Ba (None, 5, 5, 384) 1152 ['conv2d_86[0][0]']
tchNormalization)
batch_normalization_91 (Ba (None, 5, 5, 384) 1152 ['conv2d_90[0][0]']
tchNormalization)
activation_86 (Activation) (None, 5, 5, 384) 0 ['batch_normalization_87[0][0]
']
activation_90 (Activation) (None, 5, 5, 384) 0 ['batch_normalization_91[0][0]
']
conv2d_87 (Conv2D) (None, 5, 5, 384) 442368 ['activation_86[0][0]']
conv2d_88 (Conv2D) (None, 5, 5, 384) 442368 ['activation_86[0][0]']
conv2d_91 (Conv2D) (None, 5, 5, 384) 442368 ['activation_90[0][0]']
conv2d_92 (Conv2D) (None, 5, 5, 384) 442368 ['activation_90[0][0]']
average_pooling2d_8 (Avera (None, 5, 5, 2048) 0 ['mixed9[0][0]']
gePooling2D)
conv2d_85 (Conv2D) (None, 5, 5, 320) 655360 ['mixed9[0][0]']
batch_normalization_88 (Ba (None, 5, 5, 384) 1152 ['conv2d_87[0][0]']
tchNormalization)
batch_normalization_89 (Ba (None, 5, 5, 384) 1152 ['conv2d_88[0][0]']
tchNormalization)
batch_normalization_92 (Ba (None, 5, 5, 384) 1152 ['conv2d_91[0][0]']
tchNormalization)
batch_normalization_93 (Ba (None, 5, 5, 384) 1152 ['conv2d_92[0][0]']
tchNormalization)
conv2d_93 (Conv2D) (None, 5, 5, 192) 393216 ['average_pooling2d_8[0][0]']
batch_normalization_86 (Ba (None, 5, 5, 320) 960 ['conv2d_85[0][0]']
tchNormalization)
activation_87 (Activation) (None, 5, 5, 384) 0 ['batch_normalization_88[0][0]
']
activation_88 (Activation) (None, 5, 5, 384) 0 ['batch_normalization_89[0][0]
']
activation_91 (Activation) (None, 5, 5, 384) 0 ['batch_normalization_92[0][0]
']
activation_92 (Activation) (None, 5, 5, 384) 0 ['batch_normalization_93[0][0]
']
batch_normalization_94 (Ba (None, 5, 5, 192) 576 ['conv2d_93[0][0]']
tchNormalization)
activation_85 (Activation) (None, 5, 5, 320) 0 ['batch_normalization_86[0][0]
']
mixed9_1 (Concatenate) (None, 5, 5, 768) 0 ['activation_87[0][0]',
'activation_88[0][0]']
concatenate_1 (Concatenate (None, 5, 5, 768) 0 ['activation_91[0][0]',
) 'activation_92[0][0]']
activation_93 (Activation) (None, 5, 5, 192) 0 ['batch_normalization_94[0][0]
']
mixed10 (Concatenate) (None, 5, 5, 2048) 0 ['activation_85[0][0]',
'mixed9_1[0][0]',
'concatenate_1[0][0]',
'activation_93[0][0]']
global_average_pooling2d_1 (None, 2048) 0 ['mixed10[0][0]']
(GlobalAveragePooling2D)
dense_2 (Dense) (None, 1024) 2098176 ['global_average_pooling2d_1[0
][0]']
dense_3 (Dense) (None, 196) 200900 ['dense_2[0][0]']
==================================================================================================
Total params: 24101860 (91.94 MB)
Trainable params: 2299076 (8.77 MB)
Non-trainable params: 21802784 (83.17 MB)
__________________________________________________________________________________________________
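The parameter totals in the summary above can be checked by hand. The following is a minimal sketch, assuming every trainable parameter sits in the two Dense layers of the head (`dense_2` and `dense_3`) and that weights are stored as 4-byte floats. The helper names `dense_params` and `as_mb` are illustrative and not part of the notebook.

```python
# Parameter counts for the two Dense layers in the classification head.
# Shapes are read from the summary above; the model itself is not touched.

def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

def as_mb(n_params: int) -> float:
    """Size in MB, assuming 4 bytes (float32) per parameter."""
    return n_params * 4 / 1024**2

head_1 = dense_params(2048, 1024)   # dense_2: pooled features -> 1024 units
head_2 = dense_params(1024, 196)    # dense_3: 1024 units -> 196 classes
trainable = head_1 + head_2

total_params = 24101860             # "Total params" line of the summary
frozen = total_params - trainable

print(head_1, head_2, trainable, frozen)
print(f"{as_mb(trainable):.2f} MB trainable, {as_mb(frozen):.2f} MB frozen")
```

If the assumption holds, the computed values match the "Trainable params" and "Non-trainable params" lines, which suggests the convolutional base is fully frozen and only the head is learning.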
# Train the model on batches drawn from the generators
googlenet_batch_size = 16
history_googlenet = googlenet_model.fit(
    train_generator,  # Uses batches from the generator
    steps_per_epoch=len(df_train) // googlenet_batch_size,  # Number of batches per epoch
    epochs=10,
    validation_data=val_generator,  # Uses batches from the validation generator
    validation_steps=len(df_val) // googlenet_batch_size,  # Number of validation batches per epoch
)
Epoch 1/10
407/407 [==============================] - 52s 85ms/step - loss: 4.4418 - accuracy: 0.0608 - val_loss: 3.8075 - val_accuracy: 0.1159
Epoch 2/10
407/407 [==============================] - 26s 65ms/step - loss: 3.4557 - accuracy: 0.1528 - val_loss: 3.5462 - val_accuracy: 0.1407
Epoch 3/10
407/407 [==============================] - 26s 64ms/step - loss: 3.0098 - accuracy: 0.2385 - val_loss: 3.5126 - val_accuracy: 0.1525
Epoch 4/10
407/407 [==============================] - 26s 64ms/step - loss: 2.6687 - accuracy: 0.3068 - val_loss: 3.4534 - val_accuracy: 0.1742
Epoch 5/10
407/407 [==============================] - 26s 63ms/step - loss: 2.3860 - accuracy: 0.3670 - val_loss: 3.4797 - val_accuracy: 0.1897
Epoch 6/10
407/407 [==============================] - 25s 61ms/step - loss: 2.1411 - accuracy: 0.4302 - val_loss: 3.7777 - val_accuracy: 0.1866
Epoch 7/10
407/407 [==============================] - 23s 57ms/step - loss: 1.9209 - accuracy: 0.4859 - val_loss: 3.5793 - val_accuracy: 0.2083
Epoch 8/10
407/407 [==============================] - 21s 53ms/step - loss: 1.7347 - accuracy: 0.5316 - val_loss: 3.7177 - val_accuracy: 0.2139
Epoch 9/10
407/407 [==============================] - 21s 53ms/step - loss: 1.5555 - accuracy: 0.5787 - val_loss: 3.8429 - val_accuracy: 0.2188
Epoch 10/10
407/407 [==============================] - 21s 52ms/step - loss: 1.4043 - accuracy: 0.6176 - val_loss: 3.9853 - val_accuracy: 0.2294
#display model accuracy vs model loss
plot_training_history(history_googlenet)
y_pred, y_true, df_googlenet_classification_report = generate_classification_report_tf_model(
    model=googlenet_model,
    df_val=df_val,
    label_encoder=label_encoder,
    preprocess_fn=googlenet_preprocess,
    batch_size=32,
    report_name="googlenet_classification_report.csv"
)
51/51 [==============================] - 10s 43ms/step
Model Accuracy: 0.2290
Classification Report:
Report saved as: googlenet_classification_report.csv
Model Accuracy: 0.2290
Average Summary Metrics:
precision recall f1-score
macro avg 0.404701 0.227161 0.205622
weighted avg 0.423114 0.228975 0.217884
overall_accuracy 0.228975 NaN NaN
Confusion matrix for the 10 GoogleNet classes with the highest support
df_support = df_googlenet_classification_report.iloc[:-3] # exclude average rows
top_10_classes = df_support.sort_values("support", ascending=False).head(10).index.tolist()
top_10_indices = [np.where(label_encoder.classes_ == cls)[0][0] for cls in top_10_classes]
googlenet_cm = confusion_matrix(y_true, y_pred)
googlenet_cm_top10 = googlenet_cm[np.ix_(top_10_indices, top_10_indices)]
plt.figure(figsize=(10, 8))
sns.heatmap(googlenet_cm_top10, annot=True, fmt='d',
xticklabels=top_10_classes, yticklabels=top_10_classes,
cmap='Blues')
plt.title("GoogleNet Confusion Matrix (Top 10 Classes)")
plt.xlabel("Predicted")
plt.ylabel("True")
plt.tight_layout()
plt.show()
The inception modules allow the model to learn features at different scales, which can be beneficial for detecting cars of various sizes and orientations.
Training Accuracy: Increases steadily, reaching about 62% after 10 epochs.
Validation Accuracy: Climbs slowly and ends near 23%, indicating poor generalization.
Training Loss: Decreases smoothly from 4.44 to 1.40, showing effective learning on the training data.
Validation Loss: Bottoms out at 3.45 in epoch 4 and then rises to 3.99, a sign of overfitting.
Classification Report Analysis: Overall accuracy is about 23%, indicating poor performance on validation data. Precision: ~40% (macro) to ~42% (weighted). Recall: ~23% (very low). F1-Score: ~21% (macro) to ~22% (weighted).
Key Issues Identified:
Overfitting (high variance): training accuracy ends roughly 39 percentage points above validation accuracy.
Skewed per-class performance: precision is well above recall, which may point to class imbalance.
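The overfitting pattern described above can be quantified directly from the training log. This is a small sketch that uses the per-epoch values copied from the GoogleNet log; the variable names are illustrative and the notebook's `history_googlenet` object is not needed.

```python
# Per-epoch metrics copied from the training log (epochs 1..10).
train_acc = [0.0608, 0.1528, 0.2385, 0.3068, 0.3670,
             0.4302, 0.4859, 0.5316, 0.5787, 0.6176]
val_acc = [0.1159, 0.1407, 0.1525, 0.1742, 0.1897,
           0.1866, 0.2083, 0.2139, 0.2188, 0.2294]
val_loss = [3.8075, 3.5462, 3.5126, 3.4534, 3.4797,
            3.7777, 3.5793, 3.7177, 3.8429, 3.9853]

# Epoch (1-based) with the lowest validation loss.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1

# Gap between training and validation accuracy at each epoch.
gaps = [round(t - v, 4) for t, v in zip(train_acc, val_acc)]

print("best epoch by val_loss:", best_epoch)
print("accuracy gap per epoch:", gaps)
```

The gap widens every epoch while validation loss stops improving early, so one option (not used in the notebook) would be Keras's `EarlyStopping` callback monitoring `val_loss` with `restore_best_weights=True`.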
6C. AlexNet
import os

# Define paths
#image_dir = 'Car_Images/Car Images/Test Images' # Adjust based on your directory structure
image_dir = 'car_data/car_data/test'
# Prepare data
images = []
labels = []
for index, row in test_annotations_df.iterrows():
    #image_name = row['Image Name']
    image_name = row['image_name']
    # Build the full path (assumes image_name is relative to image_dir;
    # adjust if the images sit in per-class subfolders)
    image_path = os.path.join(image_dir, image_name)
    # Load and preprocess the image
    image = cv2.imread(image_path)
    if image is None:
        continue  # skip files that could not be read
    image = cv2.resize(image, (224, 224))  # Resize to match the input_shape passed to the model below
    images.append(image)
    # Assuming 'Image class' contains the class label
    #labels.append(row['Image class'])
    labels.append(row['image_class'])
# Convert to numpy arrays
images = np.array(images)
labels = np.array(labels)
# Encode labels
unique_classes = np.unique(labels)
def create_alexnet_model(input_shape, num_classes):
    model = Sequential()
    # First Convolutional Layer
    model.add(Conv2D(96, (11, 11), strides=(4, 4), activation='relu', input_shape=input_shape))
    model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2)))
    model.add(BatchNormalization())
    # Second Convolutional Layer
    model.add(Conv2D(256, (5, 5), padding='same', activation='relu'))
    model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2)))
    model.add(BatchNormalization())
    # Third Convolutional Layer
    model.add(Conv2D(384, (3, 3), padding='same', activation='relu'))
    # Fourth Convolutional Layer
    model.add(Conv2D(384, (3, 3), padding='same', activation='relu'))
    # Fifth Convolutional Layer
    model.add(Conv2D(256, (3, 3), padding='same', activation='relu'))
    model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2)))
    model.add(BatchNormalization())
    # Flatten the output
    model.add(Flatten())
    # Fully Connected Layers
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(num_classes, activation='softmax'))
    return model
# Create the model
input_shape = (224, 224, 3) # Image dimensions for AlexNet
#num_classes = len(unique_classes)
num_classes = len(df_training['labels'].unique())
model = create_alexnet_model(input_shape, num_classes)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# Data augmentation
datagen = ImageDataGenerator(rotation_range=20, width_shift_range=0.2,
height_shift_range=0.2, shear_range=0.2,
zoom_range=0.2, horizontal_flip=True,
fill_mode='nearest')
epochs=10
#batch_size=32
batch_size=16
train_steps = len(df_train) // batch_size
val_steps = len(df_val) // batch_size
model.summary()
alexnet_history = model.fit(
    train_generator,
    steps_per_epoch=train_steps,
    epochs=epochs,
    # The batch size is set by the generator, so batch_size is not passed to fit()
    validation_data=val_generator,
    validation_steps=val_steps
)
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_94 (Conv2D) (None, 54, 54, 96) 34944
max_pooling2d_4 (MaxPoolin (None, 26, 26, 96) 0
g2D)
batch_normalization_95 (Ba (None, 26, 26, 96) 384
tchNormalization)
conv2d_95 (Conv2D) (None, 26, 26, 256) 614656
max_pooling2d_5 (MaxPoolin (None, 12, 12, 256) 0
g2D)
batch_normalization_96 (Ba (None, 12, 12, 256) 1024
tchNormalization)
conv2d_96 (Conv2D) (None, 12, 12, 384) 885120
conv2d_97 (Conv2D) (None, 12, 12, 384) 1327488
conv2d_98 (Conv2D) (None, 12, 12, 256) 884992
max_pooling2d_6 (MaxPoolin (None, 5, 5, 256) 0
g2D)
batch_normalization_97 (Ba (None, 5, 5, 256) 1024
tchNormalization)
flatten (Flatten) (None, 6400) 0
dense_4 (Dense) (None, 4096) 26218496
dropout_1 (Dropout) (None, 4096) 0
dense_5 (Dense) (None, 4096) 16781312
dropout_2 (Dropout) (None, 4096) 0
dense_6 (Dense) (None, 196) 803012
=================================================================
Total params: 47552452 (181.40 MB)
Trainable params: 47551236 (181.39 MB)
Non-trainable params: 1216 (4.75 KB)
_________________________________________________________________
Epoch 1/10
407/407 [==============================] - 32s 67ms/step - loss: 5.5267 - accuracy: 0.0048 - val_loss: 5.2808 - val_accuracy: 0.0025
Epoch 2/10
407/407 [==============================] - 26s 64ms/step - loss: 5.2781 - accuracy: 0.0062 - val_loss: 5.2852 - val_accuracy: 0.0087
Epoch 3/10
407/407 [==============================] - 26s 63ms/step - loss: 5.2782 - accuracy: 0.0077 - val_loss: 5.2892 - val_accuracy: 0.0087
Epoch 4/10
407/407 [==============================] - 25s 62ms/step - loss: 5.2753 - accuracy: 0.0080 - val_loss: 5.2932 - val_accuracy: 0.0087
Epoch 5/10
407/407 [==============================] - 24s 60ms/step - loss: 5.3506 - accuracy: 0.0078 - val_loss: 69.7741 - val_accuracy: 0.0050
Epoch 6/10
407/407 [==============================] - 23s 57ms/step - loss: 5.5419 - accuracy: 0.0078 - val_loss: 5.2935 - val_accuracy: 0.0087
Epoch 7/10
407/407 [==============================] - 21s 51ms/step - loss: 5.2754 - accuracy: 0.0083 - val_loss: 5.2952 - val_accuracy: 0.0087
Epoch 8/10
407/407 [==============================] - 21s 51ms/step - loss: 5.2740 - accuracy: 0.0083 - val_loss: 5.2966 - val_accuracy: 0.0087
Epoch 9/10
407/407 [==============================] - 21s 52ms/step - loss: 5.2731 - accuracy: 0.0083 - val_loss: 5.2985 - val_accuracy: 0.0087
Epoch 10/10
407/407 [==============================] - 21s 51ms/step - loss: 5.2742 - accuracy: 0.0083 - val_loss: 5.3001 - val_accuracy: 0.0087
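The layer shapes in the AlexNet summary above follow from standard convolution arithmetic. The sketch below reproduces them, assuming Keras's default `'valid'` padding for the first convolution and the pooling layers; `conv_out` and `conv_params` are illustrative helpers, not functions from the notebook.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: str = "valid") -> int:
    """Output size along one spatial axis for a Conv2D or MaxPooling2D layer."""
    if padding == "same":
        return -(-size // stride)          # ceil(size / stride)
    return (size - kernel) // stride + 1   # 'valid': no padding

def conv_params(kernel: int, in_ch: int, out_ch: int) -> int:
    """Weights of a square kernel plus one bias per output channel."""
    return kernel * kernel * in_ch * out_ch + out_ch

sizes = []
size = 224
size = conv_out(size, 11, 4); sizes.append(size)   # conv 11x11, stride 4
size = conv_out(size, 3, 2); sizes.append(size)    # max pool 3x3, stride 2
# the 'same'-padded 5x5 conv keeps the spatial size
size = conv_out(size, 3, 2); sizes.append(size)    # max pool 3x3, stride 2
# the three 'same'-padded 3x3 convs keep the spatial size
size = conv_out(size, 3, 2); sizes.append(size)    # max pool 3x3, stride 2

flat = size * size * 256                 # Flatten output
first_conv = conv_params(11, 3, 96)      # first Conv2D layer
first_dense = flat * 4096 + 4096         # first Dense layer

print(sizes, flat, first_conv, first_dense)
print("first conv output for a 227x227 input:", conv_out(227, 11, 4))
```

The 54x54 first feature map in the summary corresponds to a 224x224 input; a 227x227 input would give 55x55 instead.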
#display model accuracy vs loss
plot_training_history(alexnet_history)
X_val = np.array([img for img in df_val['image']])
y_val_true = np.array([np.argmax(label) for label in df_val['label_categorical']])
# Predict in one go
y_val_pred = np.argmax(model.predict(X_val), axis=1)
51/51 [==============================] - 1s 11ms/step
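alexnet_report = classification_report(
    y_val_true,
    y_val_pred,
    target_names=label_encoder.classes_,
    output_dict=True,
    # Note: zero_division=1 assigns precision 1.0 to classes that are never predicted,
    # which inflates the averaged precision values
    zero_division=1
)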
acc = accuracy_score(y_val_true, y_val_pred)
df_alexnet_classification_report = pd.DataFrame(alexnet_report).transpose()
df_alexnet_classification_report.loc["overall_accuracy"] = [acc, None, None, None]
df_alexnet_classification_report.to_csv("alexnet_classification_report_vectorized.csv")
print(f"Accuracy Score: {acc:.4f}")
print("Average Summary Metrics:")
print(df_alexnet_classification_report.tail(3)[["precision", "recall", "f1-score"]])
Accuracy Score: 0.0086
Average Summary Metrics:
precision recall f1-score
macro avg 0.994942 0.005102 0.000087
weighted avg 0.991480 0.008594 0.000146
overall_accuracy 0.008594 NaN NaN
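The averaged metrics above look contradictory (precision near 1.0, recall near 0), but they are consistent with a model that predicts a single class for every validation image. The following sketch tests that hypothesis with plain arithmetic. The single-class assumption is mine and is not stated in the notebook; the inputs are the class count and the accuracy printed above.

```python
import math

num_classes = 196
accuracy = 0.008594   # overall_accuracy from the report

# Cross-entropy of a uniform guess over all classes (chance-level loss).
chance_loss = math.log(num_classes)

# Assumption: one class is predicted for every image. That class has
# recall 1.0 and precision equal to its share of the data (the accuracy).
# The other 195 classes have recall 0 and an undefined precision (0/0).
p_single = accuracy
macro_recall = 1.0 / num_classes

# zero_division=1 counts each undefined precision as 1.0 ...
macro_precision_zd1 = ((num_classes - 1) * 1.0 + p_single) / num_classes
# ... whereas zero_division=0 would count it as 0.0.
macro_precision_zd0 = p_single / num_classes

# F1 is 0 for the never-predicted classes, so only one class contributes.
f1_single = 2 * p_single * 1.0 / (p_single + 1.0)
macro_f1 = f1_single / num_classes

print(f"chance-level loss:             {chance_loss:.4f}")
print(f"macro recall:                  {macro_recall:.6f}")
print(f"macro precision (zero_div=1):  {macro_precision_zd1:.6f}")
print(f"macro precision (zero_div=0):  {macro_precision_zd0:.6f}")
print(f"macro f1:                      {macro_f1:.6f}")
```

Under this assumption the computed values reproduce the reported macro averages, and the chance-level loss is close to the plateau in the training log. That suggests the network did not learn, and that the high precision comes from `zero_division=1` rather than from correct predictions.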
# Print Classification Report
#print("Classification Report:")
#print(classification_report(y_val_true, y_val_pred, target_names=df_training['labels'].unique(), zero_division=0))
Confusion matrix
df_support = df_alexnet_classification_report.iloc[:-3] # exclude average rows
top_10_classes = df_support.sort_values("support", ascending=False).head(10).index.tolist()
top_10_indices = [np.where(label_encoder.classes_ == cls)[0][0] for cls in top_10_classes]
alexnet_cm = confusion_matrix(y_val_true, y_val_pred)
alexnet_cm_top10 = alexnet_cm[np.ix_(top_10_indices, top_10_indices)]
plt.figure(figsize=(10, 8))
sns.heatmap(alexnet_cm_top10, annot=True, fmt='d',
xticklabels=top_10_classes, yticklabels=top_10_classes,
cmap='Blues')
plt.title("AlexNet Confusion Matrix (Top 10 Classes)")
plt.xlabel("Predicted")
plt.ylabel("True")
plt.tight_layout()
plt.show()
Further Actions that can be taken are
6D. ResNet
# Load ResNet50 base model without the top layer
base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
# Freeze base model layers
for layer in base_model.layers:
    layer.trainable = False
# Add custom classification layers
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(512, activation='relu')(x)
x = Dropout(0.5)(x)
x = Dense(len(df_training['labels_encoded'].unique()), activation='softmax')(x) # Output layer
# Define model
resnet_model = Model(inputs=base_model.input, outputs=x)
# Compile model
resnet_model.compile(optimizer=Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics=['accuracy'])
# Print model summary
resnet_model.summary()
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5
94765736/94765736 [==============================] - 7s 0us/step
Model: "model_2"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_3 (InputLayer) [(None, 224, 224, 3)] 0 []
conv1_pad (ZeroPadding2D) (None, 230, 230, 3) 0 ['input_3[0][0]']
conv1_conv (Conv2D) (None, 112, 112, 64) 9472 ['conv1_pad[0][0]']
conv1_bn (BatchNormalizati (None, 112, 112, 64) 256 ['conv1_conv[0][0]']
on)
conv1_relu (Activation) (None, 112, 112, 64) 0 ['conv1_bn[0][0]']
pool1_pad (ZeroPadding2D) (None, 114, 114, 64) 0 ['conv1_relu[0][0]']
pool1_pool (MaxPooling2D) (None, 56, 56, 64) 0 ['pool1_pad[0][0]']
conv2_block1_1_conv (Conv2 (None, 56, 56, 64) 4160 ['pool1_pool[0][0]']
D)
conv2_block1_1_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block1_1_conv[0][0]']
rmalization)
conv2_block1_1_relu (Activ (None, 56, 56, 64) 0 ['conv2_block1_1_bn[0][0]']
ation)
conv2_block1_2_conv (Conv2 (None, 56, 56, 64) 36928 ['conv2_block1_1_relu[0][0]']
D)
conv2_block1_2_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block1_2_conv[0][0]']
rmalization)
conv2_block1_2_relu (Activ (None, 56, 56, 64) 0 ['conv2_block1_2_bn[0][0]']
ation)
conv2_block1_0_conv (Conv2 (None, 56, 56, 256) 16640 ['pool1_pool[0][0]']
D)
conv2_block1_3_conv (Conv2 (None, 56, 56, 256) 16640 ['conv2_block1_2_relu[0][0]']
D)
conv2_block1_0_bn (BatchNo (None, 56, 56, 256) 1024 ['conv2_block1_0_conv[0][0]']
rmalization)
conv2_block1_3_bn (BatchNo (None, 56, 56, 256) 1024 ['conv2_block1_3_conv[0][0]']
rmalization)
conv2_block1_add (Add) (None, 56, 56, 256) 0 ['conv2_block1_0_bn[0][0]',
'conv2_block1_3_bn[0][0]']
conv2_block1_out (Activati (None, 56, 56, 256) 0 ['conv2_block1_add[0][0]']
on)
conv2_block2_1_conv (Conv2 (None, 56, 56, 64) 16448 ['conv2_block1_out[0][0]']
D)
conv2_block2_1_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block2_1_conv[0][0]']
rmalization)
conv2_block2_1_relu (Activ (None, 56, 56, 64) 0 ['conv2_block2_1_bn[0][0]']
ation)
conv2_block2_2_conv (Conv2 (None, 56, 56, 64) 36928 ['conv2_block2_1_relu[0][0]']
D)
conv2_block2_2_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block2_2_conv[0][0]']
rmalization)
conv2_block2_2_relu (Activ (None, 56, 56, 64) 0 ['conv2_block2_2_bn[0][0]']
ation)
conv2_block2_3_conv (Conv2 (None, 56, 56, 256) 16640 ['conv2_block2_2_relu[0][0]']
D)
conv2_block2_3_bn (BatchNo (None, 56, 56, 256) 1024 ['conv2_block2_3_conv[0][0]']
rmalization)
conv2_block2_add (Add) (None, 56, 56, 256) 0 ['conv2_block1_out[0][0]',
'conv2_block2_3_bn[0][0]']
conv2_block2_out (Activati (None, 56, 56, 256) 0 ['conv2_block2_add[0][0]']
on)
conv2_block3_1_conv (Conv2 (None, 56, 56, 64) 16448 ['conv2_block2_out[0][0]']
D)
conv2_block3_1_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block3_1_conv[0][0]']
rmalization)
conv2_block3_1_relu (Activ (None, 56, 56, 64) 0 ['conv2_block3_1_bn[0][0]']
ation)
conv2_block3_2_conv (Conv2 (None, 56, 56, 64) 36928 ['conv2_block3_1_relu[0][0]']
D)
conv2_block3_2_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block3_2_conv[0][0]']
rmalization)
conv2_block3_2_relu (Activ (None, 56, 56, 64) 0 ['conv2_block3_2_bn[0][0]']
ation)
conv2_block3_3_conv (Conv2 (None, 56, 56, 256) 16640 ['conv2_block3_2_relu[0][0]']
D)
conv2_block3_3_bn (BatchNo (None, 56, 56, 256) 1024 ['conv2_block3_3_conv[0][0]']
rmalization)
conv2_block3_add (Add) (None, 56, 56, 256) 0 ['conv2_block2_out[0][0]',
'conv2_block3_3_bn[0][0]']
conv2_block3_out (Activati (None, 56, 56, 256) 0 ['conv2_block3_add[0][0]']
on)
conv3_block1_1_conv (Conv2 (None, 28, 28, 128) 32896 ['conv2_block3_out[0][0]']
D)
conv3_block1_1_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block1_1_conv[0][0]']
rmalization)
conv3_block1_1_relu (Activ (None, 28, 28, 128) 0 ['conv3_block1_1_bn[0][0]']
ation)
conv3_block1_2_conv (Conv2 (None, 28, 28, 128) 147584 ['conv3_block1_1_relu[0][0]']
D)
conv3_block1_2_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block1_2_conv[0][0]']
rmalization)
conv3_block1_2_relu (Activ (None, 28, 28, 128) 0 ['conv3_block1_2_bn[0][0]']
ation)
conv3_block1_0_conv (Conv2 (None, 28, 28, 512) 131584 ['conv2_block3_out[0][0]']
D)
conv3_block1_3_conv (Conv2 (None, 28, 28, 512) 66048 ['conv3_block1_2_relu[0][0]']
D)
conv3_block1_0_bn (BatchNo (None, 28, 28, 512) 2048 ['conv3_block1_0_conv[0][0]']
rmalization)
conv3_block1_3_bn (BatchNo (None, 28, 28, 512) 2048 ['conv3_block1_3_conv[0][0]']
rmalization)
conv3_block1_add (Add) (None, 28, 28, 512) 0 ['conv3_block1_0_bn[0][0]',
'conv3_block1_3_bn[0][0]']
conv3_block1_out (Activati (None, 28, 28, 512) 0 ['conv3_block1_add[0][0]']
on)
conv3_block2_1_conv (Conv2 (None, 28, 28, 128) 65664 ['conv3_block1_out[0][0]']
D)
conv3_block2_1_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block2_1_conv[0][0]']
rmalization)
conv3_block2_1_relu (Activ (None, 28, 28, 128) 0 ['conv3_block2_1_bn[0][0]']
ation)
conv3_block2_2_conv (Conv2 (None, 28, 28, 128) 147584 ['conv3_block2_1_relu[0][0]']
D)
conv3_block2_2_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block2_2_conv[0][0]']
rmalization)
conv3_block2_2_relu (Activ (None, 28, 28, 128) 0 ['conv3_block2_2_bn[0][0]']
ation)
conv3_block2_3_conv (Conv2 (None, 28, 28, 512) 66048 ['conv3_block2_2_relu[0][0]']
D)
conv3_block2_3_bn (BatchNo (None, 28, 28, 512) 2048 ['conv3_block2_3_conv[0][0]']
rmalization)
conv3_block2_add (Add) (None, 28, 28, 512) 0 ['conv3_block1_out[0][0]',
'conv3_block2_3_bn[0][0]']
conv3_block2_out (Activati (None, 28, 28, 512) 0 ['conv3_block2_add[0][0]']
on)
conv3_block3_1_conv (Conv2 (None, 28, 28, 128) 65664 ['conv3_block2_out[0][0]']
D)
conv3_block3_1_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block3_1_conv[0][0]']
rmalization)
conv3_block3_1_relu (Activ (None, 28, 28, 128) 0 ['conv3_block3_1_bn[0][0]']
ation)
conv3_block3_2_conv (Conv2 (None, 28, 28, 128) 147584 ['conv3_block3_1_relu[0][0]']
D)
conv3_block3_2_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block3_2_conv[0][0]']
rmalization)
conv3_block3_2_relu (Activ (None, 28, 28, 128) 0 ['conv3_block3_2_bn[0][0]']
ation)
conv3_block3_3_conv (Conv2 (None, 28, 28, 512) 66048 ['conv3_block3_2_relu[0][0]']
D)
conv3_block3_3_bn (BatchNo (None, 28, 28, 512) 2048 ['conv3_block3_3_conv[0][0]']
rmalization)
conv3_block3_add (Add) (None, 28, 28, 512) 0 ['conv3_block2_out[0][0]',
'conv3_block3_3_bn[0][0]']
conv3_block3_out (Activati (None, 28, 28, 512) 0 ['conv3_block3_add[0][0]']
on)
conv3_block4_1_conv (Conv2 (None, 28, 28, 128) 65664 ['conv3_block3_out[0][0]']
D)
conv3_block4_1_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block4_1_conv[0][0]']
rmalization)
conv3_block4_1_relu (Activ (None, 28, 28, 128) 0 ['conv3_block4_1_bn[0][0]']
ation)
conv3_block4_2_conv (Conv2 (None, 28, 28, 128) 147584 ['conv3_block4_1_relu[0][0]']
D)
conv3_block4_2_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block4_2_conv[0][0]']
rmalization)
conv3_block4_2_relu (Activ (None, 28, 28, 128) 0 ['conv3_block4_2_bn[0][0]']
ation)
conv3_block4_3_conv (Conv2 (None, 28, 28, 512) 66048 ['conv3_block4_2_relu[0][0]']
D)
conv3_block4_3_bn (BatchNo (None, 28, 28, 512) 2048 ['conv3_block4_3_conv[0][0]']
rmalization)
conv3_block4_add (Add) (None, 28, 28, 512) 0 ['conv3_block3_out[0][0]',
'conv3_block4_3_bn[0][0]']
conv3_block4_out (Activati (None, 28, 28, 512) 0 ['conv3_block4_add[0][0]']
on)
conv4_block1_1_conv (Conv2 (None, 14, 14, 256) 131328 ['conv3_block4_out[0][0]']
D)
conv4_block1_1_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block1_1_conv[0][0]']
rmalization)
conv4_block1_1_relu (Activ (None, 14, 14, 256) 0 ['conv4_block1_1_bn[0][0]']
ation)
conv4_block1_2_conv (Conv2 (None, 14, 14, 256) 590080 ['conv4_block1_1_relu[0][0]']
D)
conv4_block1_2_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block1_2_conv[0][0]']
rmalization)
conv4_block1_2_relu (Activ (None, 14, 14, 256) 0 ['conv4_block1_2_bn[0][0]']
ation)
conv4_block1_0_conv (Conv2 (None, 14, 14, 1024) 525312 ['conv3_block4_out[0][0]']
D)
conv4_block1_3_conv (Conv2 (None, 14, 14, 1024) 263168 ['conv4_block1_2_relu[0][0]']
D)
conv4_block1_0_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block1_0_conv[0][0]']
rmalization)
conv4_block1_3_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block1_3_conv[0][0]']
rmalization)
conv4_block1_add (Add) (None, 14, 14, 1024) 0 ['conv4_block1_0_bn[0][0]',
'conv4_block1_3_bn[0][0]']
conv4_block1_out (Activati (None, 14, 14, 1024) 0 ['conv4_block1_add[0][0]']
on)
conv4_block2_1_conv (Conv2 (None, 14, 14, 256) 262400 ['conv4_block1_out[0][0]']
D)
conv4_block2_1_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block2_1_conv[0][0]']
rmalization)
conv4_block2_1_relu (Activ (None, 14, 14, 256) 0 ['conv4_block2_1_bn[0][0]']
ation)
conv4_block2_2_conv (Conv2 (None, 14, 14, 256) 590080 ['conv4_block2_1_relu[0][0]']
D)
conv4_block2_2_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block2_2_conv[0][0]']
rmalization)
conv4_block2_2_relu (Activ (None, 14, 14, 256) 0 ['conv4_block2_2_bn[0][0]']
ation)
conv4_block2_3_conv (Conv2 (None, 14, 14, 1024) 263168 ['conv4_block2_2_relu[0][0]']
D)
conv4_block2_3_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block2_3_conv[0][0]']
rmalization)
conv4_block2_add (Add) (None, 14, 14, 1024) 0 ['conv4_block1_out[0][0]',
'conv4_block2_3_bn[0][0]']
conv4_block2_out (Activati (None, 14, 14, 1024) 0 ['conv4_block2_add[0][0]']
on)
conv4_block3_1_conv (Conv2 (None, 14, 14, 256) 262400 ['conv4_block2_out[0][0]']
D)
conv4_block3_1_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block3_1_conv[0][0]']
rmalization)
conv4_block3_1_relu (Activ (None, 14, 14, 256) 0 ['conv4_block3_1_bn[0][0]']
ation)
conv4_block3_2_conv (Conv2 (None, 14, 14, 256) 590080 ['conv4_block3_1_relu[0][0]']
D)
conv4_block3_2_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block3_2_conv[0][0]']
rmalization)
conv4_block3_2_relu (Activ (None, 14, 14, 256) 0 ['conv4_block3_2_bn[0][0]']
ation)
conv4_block3_3_conv (Conv2 (None, 14, 14, 1024) 263168 ['conv4_block3_2_relu[0][0]']
D)
conv4_block3_3_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block3_3_conv[0][0]']
rmalization)
conv4_block3_add (Add) (None, 14, 14, 1024) 0 ['conv4_block2_out[0][0]',
'conv4_block3_3_bn[0][0]']
conv4_block3_out (Activati (None, 14, 14, 1024) 0 ['conv4_block3_add[0][0]']
on)
conv4_block4_1_conv (Conv2 (None, 14, 14, 256) 262400 ['conv4_block3_out[0][0]']
D)
conv4_block4_1_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block4_1_conv[0][0]']
rmalization)
conv4_block4_1_relu (Activ (None, 14, 14, 256) 0 ['conv4_block4_1_bn[0][0]']
ation)
conv4_block4_2_conv (Conv2 (None, 14, 14, 256) 590080 ['conv4_block4_1_relu[0][0]']
D)
conv4_block4_2_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block4_2_conv[0][0]']
rmalization)
conv4_block4_2_relu (Activ (None, 14, 14, 256) 0 ['conv4_block4_2_bn[0][0]']
ation)
conv4_block4_3_conv (Conv2 (None, 14, 14, 1024) 263168 ['conv4_block4_2_relu[0][0]']
D)
conv4_block4_3_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block4_3_conv[0][0]']
rmalization)
conv4_block4_add (Add) (None, 14, 14, 1024) 0 ['conv4_block3_out[0][0]',
'conv4_block4_3_bn[0][0]']
conv4_block4_out (Activati (None, 14, 14, 1024) 0 ['conv4_block4_add[0][0]']
on)
conv4_block5_1_conv (Conv2 (None, 14, 14, 256) 262400 ['conv4_block4_out[0][0]']
D)
conv4_block5_1_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block5_1_conv[0][0]']
rmalization)
conv4_block5_1_relu (Activ (None, 14, 14, 256) 0 ['conv4_block5_1_bn[0][0]']
ation)
conv4_block5_2_conv (Conv2 (None, 14, 14, 256) 590080 ['conv4_block5_1_relu[0][0]']
D)
conv4_block5_2_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block5_2_conv[0][0]']
rmalization)
conv4_block5_2_relu (Activ (None, 14, 14, 256) 0 ['conv4_block5_2_bn[0][0]']
ation)
conv4_block5_3_conv (Conv2 (None, 14, 14, 1024) 263168 ['conv4_block5_2_relu[0][0]']
D)
conv4_block5_3_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block5_3_conv[0][0]']
rmalization)
conv4_block5_add (Add) (None, 14, 14, 1024) 0 ['conv4_block4_out[0][0]',
'conv4_block5_3_bn[0][0]']
conv4_block5_out (Activati (None, 14, 14, 1024) 0 ['conv4_block5_add[0][0]']
on)
conv4_block6_1_conv (Conv2 (None, 14, 14, 256) 262400 ['conv4_block5_out[0][0]']
D)
conv4_block6_1_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block6_1_conv[0][0]']
rmalization)
conv4_block6_1_relu (Activ (None, 14, 14, 256) 0 ['conv4_block6_1_bn[0][0]']
ation)
conv4_block6_2_conv (Conv2 (None, 14, 14, 256) 590080 ['conv4_block6_1_relu[0][0]']
D)
conv4_block6_2_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block6_2_conv[0][0]']
rmalization)
conv4_block6_2_relu (Activ (None, 14, 14, 256) 0 ['conv4_block6_2_bn[0][0]']
ation)
conv4_block6_3_conv (Conv2 (None, 14, 14, 1024) 263168 ['conv4_block6_2_relu[0][0]']
D)
conv4_block6_3_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block6_3_conv[0][0]']
rmalization)
conv4_block6_add (Add) (None, 14, 14, 1024) 0 ['conv4_block5_out[0][0]',
'conv4_block6_3_bn[0][0]']
conv4_block6_out (Activati (None, 14, 14, 1024) 0 ['conv4_block6_add[0][0]']
on)
conv5_block1_1_conv (Conv2 (None, 7, 7, 512) 524800 ['conv4_block6_out[0][0]']
D)
conv5_block1_1_bn (BatchNo (None, 7, 7, 512) 2048 ['conv5_block1_1_conv[0][0]']
rmalization)
conv5_block1_1_relu (Activ (None, 7, 7, 512) 0 ['conv5_block1_1_bn[0][0]']
ation)
conv5_block1_2_conv (Conv2 (None, 7, 7, 512) 2359808 ['conv5_block1_1_relu[0][0]']
D)
conv5_block1_2_bn (BatchNo (None, 7, 7, 512) 2048 ['conv5_block1_2_conv[0][0]']
rmalization)
conv5_block1_2_relu (Activ (None, 7, 7, 512) 0 ['conv5_block1_2_bn[0][0]']
ation)
conv5_block1_0_conv (Conv2 (None, 7, 7, 2048) 2099200 ['conv4_block6_out[0][0]']
D)
conv5_block1_3_conv (Conv2 (None, 7, 7, 2048) 1050624 ['conv5_block1_2_relu[0][0]']
D)
conv5_block1_0_bn (BatchNo (None, 7, 7, 2048) 8192 ['conv5_block1_0_conv[0][0]']
rmalization)
conv5_block1_3_bn (BatchNo (None, 7, 7, 2048) 8192 ['conv5_block1_3_conv[0][0]']
rmalization)
conv5_block1_add (Add) (None, 7, 7, 2048) 0 ['conv5_block1_0_bn[0][0]',
'conv5_block1_3_bn[0][0]']
conv5_block1_out (Activati (None, 7, 7, 2048) 0 ['conv5_block1_add[0][0]']
on)
conv5_block2_1_conv (Conv2 (None, 7, 7, 512) 1049088 ['conv5_block1_out[0][0]']
D)
conv5_block2_1_bn (BatchNo (None, 7, 7, 512) 2048 ['conv5_block2_1_conv[0][0]']
rmalization)
conv5_block2_1_relu (Activ (None, 7, 7, 512) 0 ['conv5_block2_1_bn[0][0]']
ation)
conv5_block2_2_conv (Conv2 (None, 7, 7, 512) 2359808 ['conv5_block2_1_relu[0][0]']
D)
conv5_block2_2_bn (BatchNo (None, 7, 7, 512) 2048 ['conv5_block2_2_conv[0][0]']
rmalization)
conv5_block2_2_relu (Activ (None, 7, 7, 512) 0 ['conv5_block2_2_bn[0][0]']
ation)
conv5_block2_3_conv (Conv2 (None, 7, 7, 2048) 1050624 ['conv5_block2_2_relu[0][0]']
D)
conv5_block2_3_bn (BatchNo (None, 7, 7, 2048) 8192 ['conv5_block2_3_conv[0][0]']
rmalization)
conv5_block2_add (Add) (None, 7, 7, 2048) 0 ['conv5_block1_out[0][0]',
'conv5_block2_3_bn[0][0]']
conv5_block2_out (Activati (None, 7, 7, 2048) 0 ['conv5_block2_add[0][0]']
on)
conv5_block3_1_conv (Conv2 (None, 7, 7, 512) 1049088 ['conv5_block2_out[0][0]']
D)
conv5_block3_1_bn (BatchNo (None, 7, 7, 512) 2048 ['conv5_block3_1_conv[0][0]']
rmalization)
conv5_block3_1_relu (Activ (None, 7, 7, 512) 0 ['conv5_block3_1_bn[0][0]']
ation)
conv5_block3_2_conv (Conv2 (None, 7, 7, 512) 2359808 ['conv5_block3_1_relu[0][0]']
D)
conv5_block3_2_bn (BatchNo (None, 7, 7, 512) 2048 ['conv5_block3_2_conv[0][0]']
rmalization)
conv5_block3_2_relu (Activ (None, 7, 7, 512) 0 ['conv5_block3_2_bn[0][0]']
ation)
conv5_block3_3_conv (Conv2 (None, 7, 7, 2048) 1050624 ['conv5_block3_2_relu[0][0]']
D)
conv5_block3_3_bn (BatchNo (None, 7, 7, 2048) 8192 ['conv5_block3_3_conv[0][0]']
rmalization)
conv5_block3_add (Add) (None, 7, 7, 2048) 0 ['conv5_block2_out[0][0]',
'conv5_block3_3_bn[0][0]']
conv5_block3_out (Activati (None, 7, 7, 2048) 0 ['conv5_block3_add[0][0]']
on)
global_average_pooling2d_2 (None, 2048) 0 ['conv5_block3_out[0][0]']
(GlobalAveragePooling2D)
dense_7 (Dense) (None, 512) 1049088 ['global_average_pooling2d_2[0
][0]']
dropout_3 (Dropout) (None, 512) 0 ['dense_7[0][0]']
dense_8 (Dense) (None, 196) 100548 ['dropout_3[0][0]']
==================================================================================================
Total params: 24737348 (94.37 MB)
Trainable params: 1149636 (4.39 MB)
Non-trainable params: 23587712 (89.98 MB)
__________________________________________________________________________________________________
epochs = 10
batch_size = 16
steps_per_epoch = len(df_train) // batch_size
validation_steps = len(df_val) // batch_size
resnet_history = resnet_model.fit(
    train_generator,
    steps_per_epoch=steps_per_epoch,
    validation_data=val_generator,
    validation_steps=validation_steps,
    epochs=epochs
)
Epoch 1/10
407/407 [==============================] - 40s 75ms/step - loss: 5.3845 - accuracy: 0.0069 - val_loss: 5.2877 - val_accuracy: 0.0093
Epoch 2/10
407/407 [==============================] - 26s 65ms/step - loss: 5.2828 - accuracy: 0.0054 - val_loss: 5.2869 - val_accuracy: 0.0074
Epoch 3/10
407/407 [==============================] - 26s 64ms/step - loss: 5.2766 - accuracy: 0.0066 - val_loss: 5.2850 - val_accuracy: 0.0050
Epoch 4/10
407/407 [==============================] - 26s 63ms/step - loss: 5.2752 - accuracy: 0.0048 - val_loss: 5.2817 - val_accuracy: 0.0074
Epoch 5/10
407/407 [==============================] - 25s 61ms/step - loss: 5.2722 - accuracy: 0.0089 - val_loss: 5.2828 - val_accuracy: 0.0025
Epoch 6/10
407/407 [==============================] - 23s 57ms/step - loss: 5.2694 - accuracy: 0.0063 - val_loss: 5.2840 - val_accuracy: 0.0099
Epoch 7/10
407/407 [==============================] - 21s 52ms/step - loss: 5.2642 - accuracy: 0.0092 - val_loss: 5.2849 - val_accuracy: 0.0037
Epoch 8/10
407/407 [==============================] - 21s 52ms/step - loss: 5.2605 - accuracy: 0.0095 - val_loss: 5.2806 - val_accuracy: 0.0043
Epoch 9/10
407/407 [==============================] - 21s 52ms/step - loss: 5.2546 - accuracy: 0.0085 - val_loss: 5.2800 - val_accuracy: 0.0099
Epoch 10/10
407/407 [==============================] - 21s 52ms/step - loss: 5.2487 - accuracy: 0.0108 - val_loss: 5.2801 - val_accuracy: 0.0056
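The 407 steps per epoch logged here, versus the 408 steps in the later tf.data run, can be explained by floor versus ceiling division. A minimal sketch, assuming the 8144 training images reported by flow_from_dataframe and an 80/20 split like the one in the tuned GoogLeNet cell; whether df_train for ResNet came from the same split is an assumption, not something the notebook states:

```python
import math

n_images = 8144      # from the "Found 8144 validated image filenames" output
test_size = 0.2      # assumption: same 80/20 split as in the tuned GoogLeNet cell
batch_size = 16

n_val = math.ceil(n_images * test_size)   # the validation share is rounded up
n_train = n_images - n_val

floor_steps = n_train // batch_size            # len(df_train) // batch_size
ceil_steps = math.ceil(n_train / batch_size)   # batches produced by a batched dataset

print(n_train, n_val, floor_steps, ceil_steps)
```

Under these assumptions, floor division skips the last partial batch in every epoch, while a batched tf.data pipeline also yields that final, smaller batch.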
# Accuracy/loss graph
plot_training_history(resnet_history)
y_pred, y_true, df_resnet_classification_report = generate_classification_report_tf_model(
    model=resnet_model,
    df_val=df_val,
    label_encoder=label_encoder,
    preprocess_fn=resnet_preprocess,
    batch_size=32,
    report_name="resnet_classification_report.csv"
)
51/51 [==============================] - 2s 34ms/step
Model Accuracy: 0.0055
Classification Report:
Report saved as: resnet_classification_report.csv
Model Accuracy: 0.0055
Average Summary Metrics:
                  precision    recall  f1-score
macro avg          0.933837  0.006013  0.000318
weighted avg       0.946754  0.005525  0.000312
overall_accuracy   0.005525       NaN       NaN
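A macro precision above 0.9 next to a recall below 0.01 is most likely an artifact of how undefined precision values are filled in, not a sign of good predictions. The sketch below is a dependency-free illustration of that effect; `macro_precision` is a hypothetical helper written for this note, and the assumption is that the report helper uses a fill value of 1 for classes that are never predicted (as the tuned GoogLeNet cell does with `zero_division=1`):

```python
def macro_precision(y_true, y_pred, n_classes, zero_division):
    """Macro-averaged precision; classes never predicted get `zero_division`."""
    per_class = []
    for c in range(n_classes):
        predicted_c = sum(1 for p in y_pred if p == c)
        correct_c = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        if predicted_c == 0:
            per_class.append(zero_division)   # precision is undefined (0/0) here
        else:
            per_class.append(correct_c / predicted_c)
    return sum(per_class) / n_classes

# Toy data: four classes, but the model always predicts class 0.
y_true = [0, 1, 2, 3]
y_pred = [0, 0, 0, 0]

optimistic = macro_precision(y_true, y_pred, n_classes=4, zero_division=1.0)
pessimistic = macro_precision(y_true, y_pred, n_classes=4, zero_division=0.0)
print(optimistic, pessimistic)
```

When a model collapses onto a handful of classes, most of the 196 classes fall into the "never predicted" case, so the fill value dominates the average. Recall and f1-score are the more trustworthy columns in that situation.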
# Compute confusion matrix
df_support = df_resnet_classification_report.iloc[:-3]  # exclude average rows
top_10_classes = df_support.sort_values("support", ascending=False).head(10).index.tolist()
top_10_indices = [np.where(label_encoder.classes_ == cls)[0][0] for cls in top_10_classes]
# Use the labels and predictions returned by generate_classification_report_tf_model above
resnet_cm = confusion_matrix(y_true, y_pred)
resnet_cm_top10 = resnet_cm[np.ix_(top_10_indices, top_10_indices)]
# Plot confusion matrix
plt.figure(figsize=(10, 8))
sns.heatmap(resnet_cm_top10, annot=True, fmt='d',
            xticklabels=top_10_classes, yticklabels=top_10_classes,
            cmap='Blues')
#sns.heatmap(cm, annot=True, fmt="d", cmap="Blues", xticklabels=label_encoder.classes_, yticklabels=label_encoder.classes_)
plt.title("ResNet Confusion Matrix (Top 10 Classes)")
plt.xlabel("Predicted Label")
plt.ylabel("True Label")
plt.show()
Observation:
The model is not learning: training and validation loss stay near 5.28, which is about ln(196), the loss of a uniform guess over the 196 classes, and validation accuracy stays below 1%.
Further actions could be:
GoogLeNet and ResNet will undergo hyperparameter tuning in the next milestone, since both are affected by class imbalance and their accuracy remains very low relative to the loss.
MobileNet and AlexNet are lightweight/shallow models, so they are dropped from further fine-tuning and from the comparison with the other models.
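To put the logged losses (roughly 5.25 to 5.29) in context, they can be compared with the cross-entropy of a model that assigns equal probability to every class. A small sketch, assuming 196 classes as in the final Dense layer:

```python
import math

n_classes = 196                      # size of the final Dense layer in the summary
chance_loss = math.log(n_classes)    # cross-entropy of a uniform prediction
chance_accuracy = 1 / n_classes      # expected accuracy of a uniform random guess

print(f"chance-level loss:     {chance_loss:.4f}")
print(f"chance-level accuracy: {chance_accuracy:.4f}")
```

A loss that barely moves away from this value suggests the classifier head is not extracting a usable signal, which points to the input pipeline (preprocessing, label alignment) as something to check before tuning hyperparameters.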
from tensorflow.keras.mixed_precision import set_global_policy
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint,ReduceLROnPlateau
import random
set_global_policy('mixed_float16')
INFO:tensorflow:Mixed precision compatibility check (mixed_float16): OK Your GPU will likely run quickly with dtype policy mixed_float16 as it has compute capability of at least 7.0. Your GPU: NVIDIA A10G, compute capability 8.6
print("TF Version:", tf.__version__)
print("GPU Available:", tf.config.list_physical_devices('GPU'))
TF Version: 2.16.2 GPU Available: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
unique_classes = df_training['labels'].unique()
base_model = InceptionV3(
    weights='imagenet',
    include_top=False,
    input_shape=(224, 224, 3)
)
# Freeze everything except the last 50 layers of the backbone
for layer in base_model.layers[:-50]:
    layer.trainable = False
x = base_model.output
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(512, activation='relu')(x)
x = tf.keras.layers.Dropout(0.5)(x)
output = layers.Dense(196, activation='softmax', dtype='float32')(x)  # Force output to float32
googlenet_model_tuned = Model(inputs=base_model.input, outputs=output)
googlenet_model_tuned.compile(
    optimizer=Adam(learning_rate=1e-3),
    loss='sparse_categorical_crossentropy',  # use categorical_crossentropy if labels are one-hot
    # Labels are integer class indices, so use the sparse top-k metric;
    # TopKCategoricalAccuracy expects one-hot labels and is not meaningful with integer labels
    metrics=['accuracy', tf.keras.metrics.SparseTopKCategoricalAccuracy(k=5)]
)
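Top-k accuracy with integer labels asks whether each sample's class index appears among the k highest-scoring outputs. The function below is a dependency-free sketch of that definition for illustration; `sparse_top_k_accuracy` and the toy scores are hypothetical and are not the Keras implementation:

```python
def sparse_top_k_accuracy(probabilities, labels, k=5):
    """Fraction of samples whose integer label is among the k highest scores."""
    hits = 0
    for scores, label in zip(probabilities, labels):
        # Indices of the k largest scores for this sample
        top_k = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)

# Toy scores for 3 samples over 4 classes; labels are integer class indices.
probabilities = [
    [0.10, 0.60, 0.25, 0.05],
    [0.40, 0.30, 0.20, 0.10],
    [0.05, 0.15, 0.30, 0.50],
]
labels = [1, 2, 0]

top1 = sparse_top_k_accuracy(probabilities, labels, k=1)
top3 = sparse_top_k_accuracy(probabilities, labels, k=3)
print(top1, top3)
```

Top-k accuracy can never be lower than top-1 accuracy on the same predictions, so a top-5 value of zero alongside a non-zero accuracy is a sign that the metric and the label format do not match.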
batch_size = 16
use_augmentation = True
df_split = df_training.drop(columns=['image']).copy()
df_train_googlenet, df_val_googlenet = train_test_split(df_split, test_size=0.2, random_state=42)
train_paths = df_train_googlenet["Image_Path"].values
train_labels = np.array([np.argmax(label) for label in df_train_googlenet["label_categorical"]])
val_paths = df_val_googlenet["Image_Path"].values
val_labels = np.array([np.argmax(label) for label in df_val_googlenet["label_categorical"]])
# Augmentation only; pixel scaling to [0, 1] is done once in load_and_preprocess
data_augmentation = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
    layers.RandomContrast(0.1),
    layers.RandomTranslation(0.1, 0.1)
])
def load_and_preprocess(path, label):
    image = tf.io.read_file(path)
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.image.resize(image, [224, 224])
    image = tf.cast(image, tf.float32) / 255.0  # scale pixels to [0, 1]
    return image, label
def load_preprocess_with_augment(path, label):
    image, label = load_and_preprocess(path, label)
    image = data_augmentation(image, training=True)  # random layers only act in training mode
    return image, label
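Pixel scaling should happen exactly once in a pipeline like this. If images are divided by 255 in the loader and then pass through a Rescaling(1./255) layer as well, the training images end up in a much narrower range than the validation images. A quick arithmetic check, using a hypothetical `rescale` helper:

```python
def rescale(pixel, times):
    """Divide a raw 8-bit pixel value by 255 the given number of times."""
    value = float(pixel)
    for _ in range(times):
        value /= 255.0
    return value

brightest = 255
once = rescale(brightest, times=1)    # intended range: [0, 1]
twice = rescale(brightest, times=2)   # range collapses to [0, 1/255]

print(f"scaled once:  {once:.6f}")
print(f"scaled twice: {twice:.6f}")
```

With double scaling, even the brightest pixel is close to zero, so the network sees nearly black training images while validation images keep the full [0, 1] range.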
# Training dataset
train_ds = tf.data.Dataset.from_tensor_slices((train_paths, train_labels)).shuffle(1000)
# Apply map function based on flag
if use_augmentation:
    train_ds = train_ds.map(load_preprocess_with_augment, num_parallel_calls=tf.data.AUTOTUNE)
else:
    train_ds = train_ds.map(load_and_preprocess, num_parallel_calls=tf.data.AUTOTUNE)
train_ds = train_ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)
# Validation dataset (no shuffling, no augmentation)
val_ds = tf.data.Dataset.from_tensor_slices((val_paths, val_labels)) \
    .map(load_and_preprocess, num_parallel_calls=tf.data.AUTOTUNE) \
    .batch(batch_size) \
    .prefetch(tf.data.AUTOTUNE)
callbacks = [
    EarlyStopping(
        monitor='val_loss',
        patience=30,
        restore_best_weights=True,
        verbose=1
    ),
    ModelCheckpoint(
        filepath='googlenet_finetuned_best.keras',
        monitor='val_loss',
        save_best_only=True,
        verbose=1
    ),
    ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=5, min_lr=1e-7, verbose=1)
]
train_class_indices = np.array([np.argmax(label) for label in df_train_googlenet["label_categorical"]])  # get class indices
# Compute class weights
present_classes = np.unique(train_class_indices)
class_weights_array = compute_class_weight(
    class_weight='balanced',
    classes=present_classes,
    y=train_class_indices
)
# Map each class index to its weight; zip keeps the mapping correct even if a class is missing from the split
class_weights = dict(zip(present_classes.tolist(), class_weights_array.tolist()))
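The 'balanced' heuristic used above assigns each class the weight n_samples / (n_classes * count_of_class), so rarer classes contribute more to the loss. A dependency-free sketch of that formula, with a hypothetical helper name and toy labels:

```python
from collections import Counter

def balanced_class_weights(labels):
    """weight[c] = n_samples / (n_classes * count[c]), the 'balanced' heuristic."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * counts[c]) for c in sorted(counts)}

# Toy labels: class 0 is three times as frequent as class 1.
toy_labels = [0, 0, 0, 1]
weights = balanced_class_weights(toy_labels)
print(weights)
```

In this toy case the rare class gets three times the weight of the frequent one, mirroring the 3:1 imbalance in the labels.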
history = googlenet_model_tuned.fit(
    train_ds,
    validation_data=val_ds,
    epochs=20,
    callbacks=callbacks,
    class_weight=class_weights  # class imbalance
)
Epoch 1/20
WARNING:tensorflow:AutoGraph could not transform <function create_autocast_variable at 0x7fc0908d5510> and will run it as-is. Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: <gast.gast.Expr object at 0x7fbf39af5780> To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
Epoch 1: val_loss improved from inf to 5.28016, saving model to googlenet_finetuned_best.keras
408/408 [==============================] - 60s 58ms/step - loss: 5.2974 - accuracy: 0.0037 - top_k_categorical_accuracy: 0.0196 - val_loss: 5.2802 - val_accuracy: 0.0037 - val_top_k_categorical_accuracy: 0.0000e+00 - lr: 0.0010
Epoch 2/20
Epoch 2: val_loss improved from 5.28016 to 5.27986, saving model to googlenet_finetuned_best.keras
408/408 [==============================] - 15s 36ms/step - loss: 5.2798 - accuracy: 0.0020 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2799 - val_accuracy: 0.0043 - val_top_k_categorical_accuracy: 0.0000e+00 - lr: 0.0010
Epoch 3/20
Epoch 3: val_loss improved from 5.27986 to 5.27976, saving model to googlenet_finetuned_best.keras
408/408 [==============================] - 15s 37ms/step - loss: 5.2798 - accuracy: 0.0032 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2798 - val_accuracy: 0.0043 - val_top_k_categorical_accuracy: 0.0000e+00 - lr: 0.0010
Epoch 4/20
Epoch 4: val_loss improved from 5.27976 to 5.27975, saving model to googlenet_finetuned_best.keras
408/408 [==============================] - 15s 36ms/step - loss: 5.2798 - accuracy: 0.0023 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2797 - val_accuracy: 0.0043 - val_top_k_categorical_accuracy: 0.0000e+00 - lr: 0.0010
Epoch 5/20
Epoch 5: val_loss did not improve from 5.27975
408/408 [==============================] - 13s 33ms/step - loss: 5.2798 - accuracy: 0.0035 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2798 - val_accuracy: 0.0043 - val_top_k_categorical_accuracy: 0.0000e+00 - lr: 0.0010
Epoch 6/20
Epoch 6: val_loss improved from 5.27975 to 5.27961, saving model to googlenet_finetuned_best.keras
408/408 [==============================] - 15s 36ms/step - loss: 5.2798 - accuracy: 0.0031 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2796 - val_accuracy: 0.0049 - val_top_k_categorical_accuracy: 0.0000e+00 - lr: 0.0010
Epoch 7/20
Epoch 7: val_loss improved from 5.27961 to 5.27946, saving model to googlenet_finetuned_best.keras
408/408 [==============================] - 15s 37ms/step - loss: 5.2798 - accuracy: 0.0026 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2795 - val_accuracy: 0.0049 - val_top_k_categorical_accuracy: 0.0000e+00 - lr: 0.0010
Epoch 8/20
Epoch 8: val_loss improved from 5.27946 to 5.27946, saving model to googlenet_finetuned_best.keras
408/408 [==============================] - 15s 36ms/step - loss: 5.2797 - accuracy: 0.0028 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2795 - val_accuracy: 0.0049 - val_top_k_categorical_accuracy: 0.0000e+00 - lr: 0.0010
Epoch 9/20
Epoch 9: val_loss improved from 5.27946 to 5.27937, saving model to googlenet_finetuned_best.keras
408/408 [==============================] - 15s 36ms/step - loss: 5.2798 - accuracy: 0.0034 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2794 - val_accuracy: 0.0049 - val_top_k_categorical_accuracy: 0.0000e+00 - lr: 0.0010
Epoch 10/20
Epoch 10: val_loss improved from 5.27937 to 5.27929, saving model to googlenet_finetuned_best.keras
408/408 [==============================] - 15s 37ms/step - loss: 5.2798 - accuracy: 0.0025 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2793 - val_accuracy: 0.0061 - val_top_k_categorical_accuracy: 0.0000e+00 - lr: 0.0010
Epoch 11/20
Epoch 11: val_loss did not improve from 5.27929
408/408 [==============================] - 13s 33ms/step - loss: 5.2798 - accuracy: 0.0037 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2793 - val_accuracy: 0.0061 - val_top_k_categorical_accuracy: 0.0000e+00 - lr: 0.0010
Epoch 12/20
Epoch 12: val_loss improved from 5.27929 to 5.27921, saving model to googlenet_finetuned_best.keras
408/408 [==============================] - 15s 36ms/step - loss: 5.2798 - accuracy: 0.0032 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2792 - val_accuracy: 0.0055 - val_top_k_categorical_accuracy: 0.0000e+00 - lr: 0.0010
Epoch 13/20
Epoch 13: val_loss did not improve from 5.27921
408/408 [==============================] - 13s 33ms/step - loss: 5.2798 - accuracy: 0.0029 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2792 - val_accuracy: 0.0068 - val_top_k_categorical_accuracy: 0.0000e+00 - lr: 0.0010
Epoch 14/20
Epoch 14: val_loss did not improve from 5.27921
408/408 [==============================] - 13s 33ms/step - loss: 5.2798 - accuracy: 0.0031 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2793 - val_accuracy: 0.0055 - val_top_k_categorical_accuracy: 0.0000e+00 - lr: 0.0010
Epoch 15/20
Epoch 15: val_loss did not improve from 5.27921
Epoch 15: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
408/408 [==============================] - 13s 33ms/step - loss: 5.2798 - accuracy: 0.0029 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2792 - val_accuracy: 0.0068 - val_top_k_categorical_accuracy: 0.0000e+00 - lr: 0.0010
Epoch 16/20
Epoch 16: val_loss did not improve from 5.27921
408/408 [==============================] - 13s 33ms/step - loss: 5.2789 - accuracy: 0.0038 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2793 - val_accuracy: 0.0068 - val_top_k_categorical_accuracy: 0.0000e+00 - lr: 5.0000e-04
Epoch 17/20
Epoch 17: val_loss improved from 5.27921 to 5.27892, saving model to googlenet_finetuned_best.keras
408/408 [==============================] - 15s 36ms/step - loss: 5.2795 - accuracy: 0.0029 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2789 - val_accuracy: 0.0049 - val_top_k_categorical_accuracy: 6.1387e-04 - lr: 5.0000e-04
Epoch 18/20
Epoch 18: val_loss did not improve from 5.27892
408/408 [==============================] - 13s 33ms/step - loss: 5.2790 - accuracy: 0.0037 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2789 - val_accuracy: 0.0049 - val_top_k_categorical_accuracy: 0.0012 - lr: 5.0000e-04
Epoch 19/20
Epoch 19: val_loss improved from 5.27892 to 5.27891, saving model to googlenet_finetuned_best.keras
408/408 [==============================] - 15s 36ms/step - loss: 5.2790 - accuracy: 0.0031 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2789 - val_accuracy: 0.0049 - val_top_k_categorical_accuracy: 0.0012 - lr: 5.0000e-04
Epoch 20/20
Epoch 20: val_loss did not improve from 5.27891
408/408 [==============================] - 13s 33ms/step - loss: 5.2790 - accuracy: 0.0034 - top_k_categorical_accuracy: 0.0000e+00 - val_loss: 5.2789 - val_accuracy: 0.0049 - val_top_k_categorical_accuracy: 6.1387e-04 - lr: 5.0000e-04
Restoring model weights from the end of the best epoch: 19.
Training vs. validation loss graph
plot_training_history(history)
# Get all predictions and true labels
y_true = []
y_pred = []
for X_batch, y_batch in val_ds:
    preds = googlenet_model_tuned.predict(X_batch, verbose=0)  # verbose=0 avoids one progress bar per batch
    y_pred_batch = np.argmax(preds, axis=1)
    y_true_batch = y_batch.numpy() if hasattr(y_batch, "numpy") else y_batch
    y_true.extend(y_true_batch)
    y_pred.extend(y_pred_batch)
target_names = label_encoder.classes_ if 'label_encoder' in globals() else None
report = classification_report(
    y_true, y_pred,
    target_names=target_names,
    output_dict=True,
    zero_division=1  # classes that are never predicted get precision 1.0, which inflates the averaged precision
)
df_googlenet_tuned_report = pd.DataFrame(report).transpose()
acc = accuracy_score(y_true, y_pred)
df_googlenet_tuned_report.loc["overall_accuracy"] = [acc, None, None, None]
df_googlenet_tuned_report.to_csv("googlenet_tuned_classification_report.csv")
print(f"Tuned GoogLeNet Accuracy: {acc:.4f}")
print("Average Summary Metrics:")
print(df_googlenet_tuned_report.tail(3)[["precision", "recall", "f1-score"]])
Tuned GoogLeNet Accuracy: 0.0049
Average Summary Metrics:
precision recall f1-score
macro avg 0.806229 0.003757 0.000194
weighted avg 0.802465 0.004911 0.000241
overall_accuracy 0.004911 NaN NaN
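The macro-averaged precision above (0.81) is far higher than the recall (0.004). A likely explanation is the `zero_division=1` setting: a class that is never predicted has an undefined precision (0/0), and that setting scores it as 1.0. The sketch below reproduces the effect in plain Python on hypothetical toy labels. `macro_precision` is a helper written for this illustration, not a scikit-learn function.

```python
from collections import Counter

def macro_precision(y_true, y_pred, labels, zero_division):
    """Macro-averaged precision. Classes that are never predicted (0/0)
    receive the value passed as `zero_division`."""
    predicted = Counter(y_pred)
    correct = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    per_class = [
        correct[c] / predicted[c] if predicted[c] else zero_division
        for c in labels
    ]
    return sum(per_class) / len(per_class)

# Hypothetical toy data: 5 classes, the model always predicts class 0.
labels = [0, 1, 2, 3, 4]
y_true = [0, 1, 2, 3, 4]
y_pred = [0, 0, 0, 0, 0]

inflated = macro_precision(y_true, y_pred, labels, zero_division=1.0)
strict = macro_precision(y_true, y_pred, labels, zero_division=0.0)
print(f"zero_division=1: {inflated:.2f}, zero_division=0: {strict:.2f}")
```

With a model that collapses onto a few classes, recall and F1 are therefore the more trustworthy columns in these reports.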
Confusion matrix for the tuned GoogLeNet model
cm = confusion_matrix(y_true, y_pred)
# Drop the 4 summary rows (accuracy, macro avg, weighted avg, overall_accuracy)
df_support = df_googlenet_tuned_report.iloc[:-4]
top_10_classes = df_support.sort_values("support", ascending=False).head(10).index.tolist()
# Get class indices (map from class name to index)
if target_names is not None:
    top_10_indices = [np.where(target_names == cls)[0][0] for cls in top_10_classes]
else:
    top_10_indices = list(map(int, top_10_classes))  # fallback if no class names
cm_top10 = cm[np.ix_(top_10_indices, top_10_indices)]
# Plot
plt.figure(figsize=(10, 8))
sns.heatmap(cm_top10, annot=True, fmt='d',
            xticklabels=top_10_classes,
            yticklabels=top_10_classes,
            cmap='Blues')
plt.title("Tuned GoogLeNet - Confusion Matrix (Top 10 Classes)")
plt.xlabel("Predicted")
plt.ylabel("True")
plt.tight_layout()
plt.show()
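The top-10 selection used for the plot can be sketched without NumPy: count the support per true class, keep the k most frequent classes, and read the matching rows and columns out of the full confusion matrix. The labels below are hypothetical toy data and both helper names are illustrative.

```python
from collections import Counter

def confusion_matrix_rows(y_true, y_pred, n_classes):
    """Full confusion matrix as nested lists: cm[true][predicted]."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def top_k_submatrix(cm, y_true, k):
    """Keep only the k classes with the largest support (true count)."""
    support = Counter(y_true)
    top = [cls for cls, _ in support.most_common(k)]
    sub = [[cm[r][c] for c in top] for r in top]
    return top, sub

# Hypothetical toy labels for 4 classes.
y_true = [0, 0, 0, 1, 1, 2, 3, 3, 3, 3]
y_pred = [0, 0, 1, 1, 3, 2, 3, 3, 0, 3]

cm = confusion_matrix_rows(y_true, y_pred, n_classes=4)
top, sub = top_k_submatrix(cm, y_true, k=2)
print(top, sub)
```

Note that a sub-matrix of this kind hides every error that lands in a class outside the selected k, so mostly empty cells do not by themselves indicate good predictions.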
Handling class imbalance
# Encode labels as integers
label_encoder = LabelEncoder()
df_training['labels_encoded'] = label_encoder.fit_transform(df_training['labels'])
df_training['labels'] = df_training['labels'].astype(str)
df_val['labels'] = df_val['labels'].astype(str)
# Calculate class weights (important for class imbalance)
class_weights = class_weight.compute_class_weight(
    'balanced',
    classes=np.unique(df_training['labels_encoded']),
    y=df_training['labels_encoded']
)
class_weights = dict(enumerate(class_weights)) # Convert to dictionary
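For reference, the `'balanced'` heuristic assigns each class the weight `n_samples / (n_classes * count_of_class)`, so rare classes receive larger weights. A minimal plain-Python sketch of that formula on hypothetical labels (`balanced_class_weights` is an illustrative helper, not part of scikit-learn):

```python
from collections import Counter

def balanced_class_weights(labels):
    """weight[c] = n_samples / (n_classes * count[c])"""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * counts[c]) for c in sorted(counts)}

# Hypothetical imbalanced label column: 6, 3 and 1 samples per class.
labels = [0] * 6 + [1] * 3 + [2] * 1
weights = balanced_class_weights(labels)
print(weights)

# The weighted sample count equals the original sample count.
weighted_total = sum(weights[c] * n for c, n in Counter(labels).items())
```

The weights only take effect correctly if the dictionary keys refer to the same class indices the model's output layer uses.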
batch_size = 32
image_size = (224, 224)
train_datagen = ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
    # , preprocessing_function=resnet_preprocess
)
train_generator = train_datagen.flow_from_dataframe(
    df_training,
    x_col='Image_Path',
    y_col='labels',
    target_size=image_size,
    batch_size=batch_size,
    class_mode='categorical'
)
Found 8144 validated image filenames belonging to 196 classes.
#val_datagen = ImageDataGenerator(preprocessing_function=resnet_preprocess)
val_datagen = ImageDataGenerator()
val_generator = val_datagen.flow_from_dataframe(
    df_val,
    x_col='Image_Path',
    y_col='labels',
    target_size=image_size,
    batch_size=batch_size,
    class_mode='categorical'
)
Found 1629 validated image filenames belonging to 196 classes.
Model definition
# Load ResNet50 base model without the top layer
base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
# Freeze the whole base model, then unfreeze only its last 40 layers for fine-tuning
for layer in base_model.layers:
    layer.trainable = False
for layer in base_model.layers[-40:]:
    layer.trainable = True
# Count classes on the same frame the training generator reads from
num_classes = df_training['labels'].nunique()
# Add custom classification layers
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(512, activation='relu')(x)
x = Dropout(0.5)(x)
x = Dense(num_classes, activation='softmax')(x) # Output layer
# Define model
resnet_tuned_model = Model(inputs=base_model.input, outputs=x)
# Compile with a low learning rate (1e-5) so fine-tuning does not overwrite the pretrained weights too quickly
resnet_tuned_model.compile(optimizer=Adam(learning_rate=1e-5), loss='categorical_crossentropy', metrics=['accuracy'])
resnet_tuned_model.summary()
Model: "model_4"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_5 (InputLayer) [(None, 224, 224, 3)] 0 []
conv1_pad (ZeroPadding2D) (None, 230, 230, 3) 0 ['input_5[0][0]']
conv1_conv (Conv2D) (None, 112, 112, 64) 9472 ['conv1_pad[0][0]']
conv1_bn (BatchNormalizati (None, 112, 112, 64) 256 ['conv1_conv[0][0]']
on)
conv1_relu (Activation) (None, 112, 112, 64) 0 ['conv1_bn[0][0]']
pool1_pad (ZeroPadding2D) (None, 114, 114, 64) 0 ['conv1_relu[0][0]']
pool1_pool (MaxPooling2D) (None, 56, 56, 64) 0 ['pool1_pad[0][0]']
conv2_block1_1_conv (Conv2 (None, 56, 56, 64) 4160 ['pool1_pool[0][0]']
D)
conv2_block1_1_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block1_1_conv[0][0]']
rmalization)
conv2_block1_1_relu (Activ (None, 56, 56, 64) 0 ['conv2_block1_1_bn[0][0]']
ation)
conv2_block1_2_conv (Conv2 (None, 56, 56, 64) 36928 ['conv2_block1_1_relu[0][0]']
D)
conv2_block1_2_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block1_2_conv[0][0]']
rmalization)
conv2_block1_2_relu (Activ (None, 56, 56, 64) 0 ['conv2_block1_2_bn[0][0]']
ation)
conv2_block1_0_conv (Conv2 (None, 56, 56, 256) 16640 ['pool1_pool[0][0]']
D)
conv2_block1_3_conv (Conv2 (None, 56, 56, 256) 16640 ['conv2_block1_2_relu[0][0]']
D)
conv2_block1_0_bn (BatchNo (None, 56, 56, 256) 1024 ['conv2_block1_0_conv[0][0]']
rmalization)
conv2_block1_3_bn (BatchNo (None, 56, 56, 256) 1024 ['conv2_block1_3_conv[0][0]']
rmalization)
conv2_block1_add (Add) (None, 56, 56, 256) 0 ['conv2_block1_0_bn[0][0]',
'conv2_block1_3_bn[0][0]']
conv2_block1_out (Activati (None, 56, 56, 256) 0 ['conv2_block1_add[0][0]']
on)
conv2_block2_1_conv (Conv2 (None, 56, 56, 64) 16448 ['conv2_block1_out[0][0]']
D)
conv2_block2_1_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block2_1_conv[0][0]']
rmalization)
conv2_block2_1_relu (Activ (None, 56, 56, 64) 0 ['conv2_block2_1_bn[0][0]']
ation)
conv2_block2_2_conv (Conv2 (None, 56, 56, 64) 36928 ['conv2_block2_1_relu[0][0]']
D)
conv2_block2_2_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block2_2_conv[0][0]']
rmalization)
conv2_block2_2_relu (Activ (None, 56, 56, 64) 0 ['conv2_block2_2_bn[0][0]']
ation)
conv2_block2_3_conv (Conv2 (None, 56, 56, 256) 16640 ['conv2_block2_2_relu[0][0]']
D)
conv2_block2_3_bn (BatchNo (None, 56, 56, 256) 1024 ['conv2_block2_3_conv[0][0]']
rmalization)
conv2_block2_add (Add) (None, 56, 56, 256) 0 ['conv2_block1_out[0][0]',
'conv2_block2_3_bn[0][0]']
conv2_block2_out (Activati (None, 56, 56, 256) 0 ['conv2_block2_add[0][0]']
on)
conv2_block3_1_conv (Conv2 (None, 56, 56, 64) 16448 ['conv2_block2_out[0][0]']
D)
conv2_block3_1_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block3_1_conv[0][0]']
rmalization)
conv2_block3_1_relu (Activ (None, 56, 56, 64) 0 ['conv2_block3_1_bn[0][0]']
ation)
conv2_block3_2_conv (Conv2 (None, 56, 56, 64) 36928 ['conv2_block3_1_relu[0][0]']
D)
conv2_block3_2_bn (BatchNo (None, 56, 56, 64) 256 ['conv2_block3_2_conv[0][0]']
rmalization)
conv2_block3_2_relu (Activ (None, 56, 56, 64) 0 ['conv2_block3_2_bn[0][0]']
ation)
conv2_block3_3_conv (Conv2 (None, 56, 56, 256) 16640 ['conv2_block3_2_relu[0][0]']
D)
conv2_block3_3_bn (BatchNo (None, 56, 56, 256) 1024 ['conv2_block3_3_conv[0][0]']
rmalization)
conv2_block3_add (Add) (None, 56, 56, 256) 0 ['conv2_block2_out[0][0]',
'conv2_block3_3_bn[0][0]']
conv2_block3_out (Activati (None, 56, 56, 256) 0 ['conv2_block3_add[0][0]']
on)
conv3_block1_1_conv (Conv2 (None, 28, 28, 128) 32896 ['conv2_block3_out[0][0]']
D)
conv3_block1_1_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block1_1_conv[0][0]']
rmalization)
conv3_block1_1_relu (Activ (None, 28, 28, 128) 0 ['conv3_block1_1_bn[0][0]']
ation)
conv3_block1_2_conv (Conv2 (None, 28, 28, 128) 147584 ['conv3_block1_1_relu[0][0]']
D)
conv3_block1_2_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block1_2_conv[0][0]']
rmalization)
conv3_block1_2_relu (Activ (None, 28, 28, 128) 0 ['conv3_block1_2_bn[0][0]']
ation)
conv3_block1_0_conv (Conv2 (None, 28, 28, 512) 131584 ['conv2_block3_out[0][0]']
D)
conv3_block1_3_conv (Conv2 (None, 28, 28, 512) 66048 ['conv3_block1_2_relu[0][0]']
D)
conv3_block1_0_bn (BatchNo (None, 28, 28, 512) 2048 ['conv3_block1_0_conv[0][0]']
rmalization)
conv3_block1_3_bn (BatchNo (None, 28, 28, 512) 2048 ['conv3_block1_3_conv[0][0]']
rmalization)
conv3_block1_add (Add) (None, 28, 28, 512) 0 ['conv3_block1_0_bn[0][0]',
'conv3_block1_3_bn[0][0]']
conv3_block1_out (Activati (None, 28, 28, 512) 0 ['conv3_block1_add[0][0]']
on)
conv3_block2_1_conv (Conv2 (None, 28, 28, 128) 65664 ['conv3_block1_out[0][0]']
D)
conv3_block2_1_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block2_1_conv[0][0]']
rmalization)
conv3_block2_1_relu (Activ (None, 28, 28, 128) 0 ['conv3_block2_1_bn[0][0]']
ation)
conv3_block2_2_conv (Conv2 (None, 28, 28, 128) 147584 ['conv3_block2_1_relu[0][0]']
D)
conv3_block2_2_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block2_2_conv[0][0]']
rmalization)
conv3_block2_2_relu (Activ (None, 28, 28, 128) 0 ['conv3_block2_2_bn[0][0]']
ation)
conv3_block2_3_conv (Conv2 (None, 28, 28, 512) 66048 ['conv3_block2_2_relu[0][0]']
D)
conv3_block2_3_bn (BatchNo (None, 28, 28, 512) 2048 ['conv3_block2_3_conv[0][0]']
rmalization)
conv3_block2_add (Add) (None, 28, 28, 512) 0 ['conv3_block1_out[0][0]',
'conv3_block2_3_bn[0][0]']
conv3_block2_out (Activati (None, 28, 28, 512) 0 ['conv3_block2_add[0][0]']
on)
conv3_block3_1_conv (Conv2 (None, 28, 28, 128) 65664 ['conv3_block2_out[0][0]']
D)
conv3_block3_1_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block3_1_conv[0][0]']
rmalization)
conv3_block3_1_relu (Activ (None, 28, 28, 128) 0 ['conv3_block3_1_bn[0][0]']
ation)
conv3_block3_2_conv (Conv2 (None, 28, 28, 128) 147584 ['conv3_block3_1_relu[0][0]']
D)
conv3_block3_2_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block3_2_conv[0][0]']
rmalization)
conv3_block3_2_relu (Activ (None, 28, 28, 128) 0 ['conv3_block3_2_bn[0][0]']
ation)
conv3_block3_3_conv (Conv2 (None, 28, 28, 512) 66048 ['conv3_block3_2_relu[0][0]']
D)
conv3_block3_3_bn (BatchNo (None, 28, 28, 512) 2048 ['conv3_block3_3_conv[0][0]']
rmalization)
conv3_block3_add (Add) (None, 28, 28, 512) 0 ['conv3_block2_out[0][0]',
'conv3_block3_3_bn[0][0]']
conv3_block3_out (Activati (None, 28, 28, 512) 0 ['conv3_block3_add[0][0]']
on)
conv3_block4_1_conv (Conv2 (None, 28, 28, 128) 65664 ['conv3_block3_out[0][0]']
D)
conv3_block4_1_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block4_1_conv[0][0]']
rmalization)
conv3_block4_1_relu (Activ (None, 28, 28, 128) 0 ['conv3_block4_1_bn[0][0]']
ation)
conv3_block4_2_conv (Conv2 (None, 28, 28, 128) 147584 ['conv3_block4_1_relu[0][0]']
D)
conv3_block4_2_bn (BatchNo (None, 28, 28, 128) 512 ['conv3_block4_2_conv[0][0]']
rmalization)
conv3_block4_2_relu (Activ (None, 28, 28, 128) 0 ['conv3_block4_2_bn[0][0]']
ation)
conv3_block4_3_conv (Conv2 (None, 28, 28, 512) 66048 ['conv3_block4_2_relu[0][0]']
D)
conv3_block4_3_bn (BatchNo (None, 28, 28, 512) 2048 ['conv3_block4_3_conv[0][0]']
rmalization)
conv3_block4_add (Add) (None, 28, 28, 512) 0 ['conv3_block3_out[0][0]',
'conv3_block4_3_bn[0][0]']
conv3_block4_out (Activati (None, 28, 28, 512) 0 ['conv3_block4_add[0][0]']
on)
conv4_block1_1_conv (Conv2 (None, 14, 14, 256) 131328 ['conv3_block4_out[0][0]']
D)
conv4_block1_1_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block1_1_conv[0][0]']
rmalization)
conv4_block1_1_relu (Activ (None, 14, 14, 256) 0 ['conv4_block1_1_bn[0][0]']
ation)
conv4_block1_2_conv (Conv2 (None, 14, 14, 256) 590080 ['conv4_block1_1_relu[0][0]']
D)
conv4_block1_2_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block1_2_conv[0][0]']
rmalization)
conv4_block1_2_relu (Activ (None, 14, 14, 256) 0 ['conv4_block1_2_bn[0][0]']
ation)
conv4_block1_0_conv (Conv2 (None, 14, 14, 1024) 525312 ['conv3_block4_out[0][0]']
D)
conv4_block1_3_conv (Conv2 (None, 14, 14, 1024) 263168 ['conv4_block1_2_relu[0][0]']
D)
conv4_block1_0_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block1_0_conv[0][0]']
rmalization)
conv4_block1_3_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block1_3_conv[0][0]']
rmalization)
conv4_block1_add (Add) (None, 14, 14, 1024) 0 ['conv4_block1_0_bn[0][0]',
'conv4_block1_3_bn[0][0]']
conv4_block1_out (Activati (None, 14, 14, 1024) 0 ['conv4_block1_add[0][0]']
on)
conv4_block2_1_conv (Conv2 (None, 14, 14, 256) 262400 ['conv4_block1_out[0][0]']
D)
conv4_block2_1_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block2_1_conv[0][0]']
rmalization)
conv4_block2_1_relu (Activ (None, 14, 14, 256) 0 ['conv4_block2_1_bn[0][0]']
ation)
conv4_block2_2_conv (Conv2 (None, 14, 14, 256) 590080 ['conv4_block2_1_relu[0][0]']
D)
conv4_block2_2_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block2_2_conv[0][0]']
rmalization)
conv4_block2_2_relu (Activ (None, 14, 14, 256) 0 ['conv4_block2_2_bn[0][0]']
ation)
conv4_block2_3_conv (Conv2 (None, 14, 14, 1024) 263168 ['conv4_block2_2_relu[0][0]']
D)
conv4_block2_3_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block2_3_conv[0][0]']
rmalization)
conv4_block2_add (Add) (None, 14, 14, 1024) 0 ['conv4_block1_out[0][0]',
'conv4_block2_3_bn[0][0]']
conv4_block2_out (Activati (None, 14, 14, 1024) 0 ['conv4_block2_add[0][0]']
on)
conv4_block3_1_conv (Conv2 (None, 14, 14, 256) 262400 ['conv4_block2_out[0][0]']
D)
conv4_block3_1_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block3_1_conv[0][0]']
rmalization)
conv4_block3_1_relu (Activ (None, 14, 14, 256) 0 ['conv4_block3_1_bn[0][0]']
ation)
conv4_block3_2_conv (Conv2 (None, 14, 14, 256) 590080 ['conv4_block3_1_relu[0][0]']
D)
conv4_block3_2_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block3_2_conv[0][0]']
rmalization)
conv4_block3_2_relu (Activ (None, 14, 14, 256) 0 ['conv4_block3_2_bn[0][0]']
ation)
conv4_block3_3_conv (Conv2 (None, 14, 14, 1024) 263168 ['conv4_block3_2_relu[0][0]']
D)
conv4_block3_3_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block3_3_conv[0][0]']
rmalization)
conv4_block3_add (Add) (None, 14, 14, 1024) 0 ['conv4_block2_out[0][0]',
'conv4_block3_3_bn[0][0]']
conv4_block3_out (Activati (None, 14, 14, 1024) 0 ['conv4_block3_add[0][0]']
on)
conv4_block4_1_conv (Conv2 (None, 14, 14, 256) 262400 ['conv4_block3_out[0][0]']
D)
conv4_block4_1_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block4_1_conv[0][0]']
rmalization)
conv4_block4_1_relu (Activ (None, 14, 14, 256) 0 ['conv4_block4_1_bn[0][0]']
ation)
conv4_block4_2_conv (Conv2 (None, 14, 14, 256) 590080 ['conv4_block4_1_relu[0][0]']
D)
conv4_block4_2_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block4_2_conv[0][0]']
rmalization)
conv4_block4_2_relu (Activ (None, 14, 14, 256) 0 ['conv4_block4_2_bn[0][0]']
ation)
conv4_block4_3_conv (Conv2 (None, 14, 14, 1024) 263168 ['conv4_block4_2_relu[0][0]']
D)
conv4_block4_3_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block4_3_conv[0][0]']
rmalization)
conv4_block4_add (Add) (None, 14, 14, 1024) 0 ['conv4_block3_out[0][0]',
'conv4_block4_3_bn[0][0]']
conv4_block4_out (Activati (None, 14, 14, 1024) 0 ['conv4_block4_add[0][0]']
on)
conv4_block5_1_conv (Conv2 (None, 14, 14, 256) 262400 ['conv4_block4_out[0][0]']
D)
conv4_block5_1_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block5_1_conv[0][0]']
rmalization)
conv4_block5_1_relu (Activ (None, 14, 14, 256) 0 ['conv4_block5_1_bn[0][0]']
ation)
conv4_block5_2_conv (Conv2 (None, 14, 14, 256) 590080 ['conv4_block5_1_relu[0][0]']
D)
conv4_block5_2_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block5_2_conv[0][0]']
rmalization)
conv4_block5_2_relu (Activ (None, 14, 14, 256) 0 ['conv4_block5_2_bn[0][0]']
ation)
conv4_block5_3_conv (Conv2 (None, 14, 14, 1024) 263168 ['conv4_block5_2_relu[0][0]']
D)
conv4_block5_3_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block5_3_conv[0][0]']
rmalization)
conv4_block5_add (Add) (None, 14, 14, 1024) 0 ['conv4_block4_out[0][0]',
'conv4_block5_3_bn[0][0]']
conv4_block5_out (Activati (None, 14, 14, 1024) 0 ['conv4_block5_add[0][0]']
on)
conv4_block6_1_conv (Conv2 (None, 14, 14, 256) 262400 ['conv4_block5_out[0][0]']
D)
conv4_block6_1_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block6_1_conv[0][0]']
rmalization)
conv4_block6_1_relu (Activ (None, 14, 14, 256) 0 ['conv4_block6_1_bn[0][0]']
ation)
conv4_block6_2_conv (Conv2 (None, 14, 14, 256) 590080 ['conv4_block6_1_relu[0][0]']
D)
conv4_block6_2_bn (BatchNo (None, 14, 14, 256) 1024 ['conv4_block6_2_conv[0][0]']
rmalization)
conv4_block6_2_relu (Activ (None, 14, 14, 256) 0 ['conv4_block6_2_bn[0][0]']
ation)
conv4_block6_3_conv (Conv2 (None, 14, 14, 1024) 263168 ['conv4_block6_2_relu[0][0]']
D)
conv4_block6_3_bn (BatchNo (None, 14, 14, 1024) 4096 ['conv4_block6_3_conv[0][0]']
rmalization)
conv4_block6_add (Add) (None, 14, 14, 1024) 0 ['conv4_block5_out[0][0]',
'conv4_block6_3_bn[0][0]']
conv4_block6_out (Activati (None, 14, 14, 1024) 0 ['conv4_block6_add[0][0]']
on)
conv5_block1_1_conv (Conv2 (None, 7, 7, 512) 524800 ['conv4_block6_out[0][0]']
D)
conv5_block1_1_bn (BatchNo (None, 7, 7, 512) 2048 ['conv5_block1_1_conv[0][0]']
rmalization)
conv5_block1_1_relu (Activ (None, 7, 7, 512) 0 ['conv5_block1_1_bn[0][0]']
ation)
conv5_block1_2_conv (Conv2 (None, 7, 7, 512) 2359808 ['conv5_block1_1_relu[0][0]']
D)
conv5_block1_2_bn (BatchNo (None, 7, 7, 512) 2048 ['conv5_block1_2_conv[0][0]']
rmalization)
conv5_block1_2_relu (Activ (None, 7, 7, 512) 0 ['conv5_block1_2_bn[0][0]']
ation)
conv5_block1_0_conv (Conv2 (None, 7, 7, 2048) 2099200 ['conv4_block6_out[0][0]']
D)
conv5_block1_3_conv (Conv2 (None, 7, 7, 2048) 1050624 ['conv5_block1_2_relu[0][0]']
D)
conv5_block1_0_bn (BatchNo (None, 7, 7, 2048) 8192 ['conv5_block1_0_conv[0][0]']
rmalization)
conv5_block1_3_bn (BatchNo (None, 7, 7, 2048) 8192 ['conv5_block1_3_conv[0][0]']
rmalization)
conv5_block1_add (Add) (None, 7, 7, 2048) 0 ['conv5_block1_0_bn[0][0]',
'conv5_block1_3_bn[0][0]']
conv5_block1_out (Activati (None, 7, 7, 2048) 0 ['conv5_block1_add[0][0]']
on)
conv5_block2_1_conv (Conv2 (None, 7, 7, 512) 1049088 ['conv5_block1_out[0][0]']
D)
conv5_block2_1_bn (BatchNo (None, 7, 7, 512) 2048 ['conv5_block2_1_conv[0][0]']
rmalization)
conv5_block2_1_relu (Activ (None, 7, 7, 512) 0 ['conv5_block2_1_bn[0][0]']
ation)
conv5_block2_2_conv (Conv2 (None, 7, 7, 512) 2359808 ['conv5_block2_1_relu[0][0]']
D)
conv5_block2_2_bn (BatchNo (None, 7, 7, 512) 2048 ['conv5_block2_2_conv[0][0]']
rmalization)
conv5_block2_2_relu (Activ (None, 7, 7, 512) 0 ['conv5_block2_2_bn[0][0]']
ation)
conv5_block2_3_conv (Conv2 (None, 7, 7, 2048) 1050624 ['conv5_block2_2_relu[0][0]']
D)
conv5_block2_3_bn (BatchNo (None, 7, 7, 2048) 8192 ['conv5_block2_3_conv[0][0]']
rmalization)
conv5_block2_add (Add) (None, 7, 7, 2048) 0 ['conv5_block1_out[0][0]',
'conv5_block2_3_bn[0][0]']
conv5_block2_out (Activati (None, 7, 7, 2048) 0 ['conv5_block2_add[0][0]']
on)
conv5_block3_1_conv (Conv2 (None, 7, 7, 512) 1049088 ['conv5_block2_out[0][0]']
D)
conv5_block3_1_bn (BatchNo (None, 7, 7, 512) 2048 ['conv5_block3_1_conv[0][0]']
rmalization)
conv5_block3_1_relu (Activ (None, 7, 7, 512) 0 ['conv5_block3_1_bn[0][0]']
ation)
conv5_block3_2_conv (Conv2 (None, 7, 7, 512) 2359808 ['conv5_block3_1_relu[0][0]']
D)
conv5_block3_2_bn (BatchNo (None, 7, 7, 512) 2048 ['conv5_block3_2_conv[0][0]']
rmalization)
conv5_block3_2_relu (Activ (None, 7, 7, 512) 0 ['conv5_block3_2_bn[0][0]']
ation)
conv5_block3_3_conv (Conv2 (None, 7, 7, 2048) 1050624 ['conv5_block3_2_relu[0][0]']
D)
conv5_block3_3_bn (BatchNo (None, 7, 7, 2048) 8192 ['conv5_block3_3_conv[0][0]']
rmalization)
conv5_block3_add (Add) (None, 7, 7, 2048) 0 ['conv5_block2_out[0][0]',
'conv5_block3_3_bn[0][0]']
conv5_block3_out (Activati (None, 7, 7, 2048) 0 ['conv5_block3_add[0][0]']
on)
global_average_pooling2d_4 (None, 2048) 0 ['conv5_block3_out[0][0]']
(GlobalAveragePooling2D)
dense_11 (Dense) (None, 512) 1049088 ['global_average_pooling2d_4[0
][0]']
dropout_5 (Dropout) (None, 512) 0 ['dense_11[0][0]']
dense_12 (Dense) (None, 196) 100548 ['dropout_5[0][0]']
==================================================================================================
Total params: 24737348 (94.37 MB)
Trainable params: 16981444 (64.78 MB)
Non-trainable params: 7755904 (29.59 MB)
__________________________________________________________________________________________________
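The parameter counts in the summary can be checked by hand. A dense layer has `inputs * units` weights plus `units` biases, and the trainable and non-trainable totals should add up to the overall total. The numbers below are copied from the summary above; `dense_params` is an illustrative helper.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

# Classification head: GlobalAveragePooling (2048) -> Dense(512) -> Dense(196)
dense_11 = dense_params(2048, 512)
dense_12 = dense_params(512, 196)
head = dense_11 + dense_12

# Totals reported by the summary
trainable, non_trainable = 16_981_444, 7_755_904
total = trainable + non_trainable

print(dense_11, dense_12, head, total)
```

The head accounts for only about 1.15 million of the roughly 17 million trainable parameters, so most of the trainable capacity sits in the 40 unfrozen ResNet50 layers.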
Data Generation
# Custom data generator
""" def custom_data_generator(generator, class_weights):
while True:
x, y = next(generator)
sample_weights = np.array([class_weights[np.argmax(label)] for label in y])
yield x, y, sample_weights """
def custom_data_generator(generator, class_weights):
    while True:
        x, y = next(generator)
        sample_weights = np.array([class_weights[label] for label in np.argmax(y, axis=1)])
        yield x, y, sample_weights
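The wrapper above turns per-class weights into per-sample weights by taking the arg-max of each one-hot label. The same idea in plain Python, on a single hypothetical batch (the image names and weights are placeholders, and the finite list stands in for the endless Keras generator):

```python
def argmax(row):
    """Index of the largest entry in a one-hot (or probability) row."""
    return max(range(len(row)), key=row.__getitem__)

def with_sample_weights(batches, class_weights):
    """Yield (x, y, weights), where each sample gets the weight of its class."""
    for x, y in batches:
        weights = [class_weights[argmax(label)] for label in y]
        yield x, y, weights

# Hypothetical class weights and one batch of three one-hot labels.
class_weights = {0: 0.5, 1: 2.0, 2: 1.0}
batches = [(["img_a", "img_b", "img_c"],
            [[1, 0, 0], [0, 1, 0], [0, 1, 0]])]

x, y, w = next(with_sample_weights(batches, class_weights))
print(w)
```

Passing `class_weight=` to `fit`, as the training cell further down does, achieves the same weighting without a wrapper.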
# Define data augmentation
#train_datagen = ImageDataGenerator(
# rotation_range=20,
# width_shift_range=0.2,
# height_shift_range=0.2,
# shear_range=0.2,
# zoom_range=0.2,
# horizontal_flip=True,
# fill_mode='nearest',
# preprocessing_function=preprocess_input # Use the ResNet50 preprocessing function
#)
# Load data using flow_from_dataframe
#print(df_train.columns) # Check the columns in the DataFrame
# Convert labels to string format if they are not already
#df_train['labels'] = df_train['labels'].astype(str)
# Create training data generator
#train_generator = train_datagen.flow_from_dataframe(
# df_train,
# x_col='Image_Path',
# y_col='labels', # Now it should be in the correct format
# target_size=(224, 224),
# batch_size=batch_size,
# class_mode='categorical'
#)
# Create validation data generator (assuming df_val is defined)
#val_datagen = ImageDataGenerator(preprocessing_function=preprocess_input) # No augmentation for validation
#val_generator = val_datagen.flow_from_dataframe(
# df_val,
# x_col='Image_Path',
# y_col='labels',
# target_size=(224, 224),
# batch_size=batch_size,
# class_mode='categorical'
#)
# Train the model for a few epochs
#steps_per_epoch = len(df_training) // batch_size
#validation_steps = len(df_val) // batch_size
steps_per_epoch = np.ceil(len(df_training) / batch_size).astype(int)
validation_steps = np.ceil(len(df_val) / batch_size).astype(int)
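Rounding up matters here: with floor division the last partial batch of each epoch would be skipped. A quick check with the dataset sizes reported by the generators (8144 training and 1629 validation images, batch size 32); `steps_for` is an illustrative helper.

```python
import math

def steps_for(n_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(n_samples / batch_size)

train_steps = steps_for(8144, 32)
val_steps = steps_for(1629, 32)

# Floor division would drop the final partial batch.
floor_train_steps = 8144 // 32
skipped_images = 8144 - floor_train_steps * 32

print(train_steps, val_steps, floor_train_steps, skipped_images)
```

The 255 training steps agree with the `255/255` shown in the training log.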
# Define number of fine-tuning epochs
fine_tune_epochs = 20
# Define callbacks
early_stopping = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)
model_checkpoint = ModelCheckpoint('best_model_tuned_resnet.keras', save_best_only=True, monitor='val_loss')
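As a rough mental model of `EarlyStopping(patience=5)`: training stops once the monitored value has failed to improve for 5 consecutive epochs, and `restore_best_weights=True` rolls the model back to the best epoch. The sketch below is a simplified plain-Python version of that rule (it ignores options such as `min_delta`). `early_stopping_epoch` is an illustrative helper, and the `stalled` series is hypothetical.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (stop_epoch, best_epoch), both 1-based.
    stop_epoch is None when training runs through every epoch."""
    best = float("inf")
    best_epoch = None
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch

# val_loss of the first four epochs in the training log: always improving.
improving = [5.2760, 5.1594, 5.0490, 4.9000]
# Hypothetical run that stops improving after epoch 2.
stalled = [2.0, 1.8, 1.9, 1.85, 1.95, 2.1, 2.2, 2.3]

print(early_stopping_epoch(improving, patience=5))
print(early_stopping_epoch(stalled, patience=5))
```

In the run logged in this notebook `val_loss` fell in every one of the 20 epochs, so the callback never triggered and the final weights are also the best ones.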
# Train the model with callbacks and custom data generator
#resnet_history_fine_tune = resnet_model.fit(
# custom_data_generator(train_generator, class_weights),
# steps_per_epoch=steps_per_epoch,
# validation_data=val_generator,
# validation_steps=validation_steps,
# epochs=fine_tune_epochs,
# callbacks=[early_stopping, model_checkpoint]
#)
resnet_history_fine_tune = resnet_tuned_model.fit(
    train_generator,
    steps_per_epoch=steps_per_epoch,
    validation_data=val_generator,
    validation_steps=validation_steps,
    epochs=fine_tune_epochs,
    callbacks=[early_stopping, model_checkpoint],
    class_weight=class_weights
)
Epoch 1/20
255/255 [==============================] - 102s 343ms/step - loss: 5.5361 - accuracy: 0.0060 - val_loss: 5.2760 - val_accuracy: 0.0068
Epoch 2/20
255/255 [==============================] - 85s 334ms/step - loss: 5.3247 - accuracy: 0.0088 - val_loss: 5.1594 - val_accuracy: 0.0141
Epoch 3/20
255/255 [==============================] - 86s 335ms/step - loss: 5.2186 - accuracy: 0.0140 - val_loss: 5.0490 - val_accuracy: 0.0325
Epoch 4/20
255/255 [==============================] - 86s 336ms/step - loss: 5.1338 - accuracy: 0.0195 - val_loss: 4.9000 - val_accuracy: 0.0552
Epoch 5/20
255/255 [==============================] - 86s 337ms/step - loss: 5.0256 - accuracy: 0.0309 - val_loss: 4.6958 - val_accuracy: 0.0884
Epoch 6/20
255/255 [==============================] - 86s 336ms/step - loss: 4.8848 - accuracy: 0.0500 - val_loss: 4.4740 - val_accuracy: 0.1289
Epoch 7/20
255/255 [==============================] - 85s 335ms/step - loss: 4.7340 - accuracy: 0.0640 - val_loss: 4.2258 - val_accuracy: 0.1805
Epoch 8/20
255/255 [==============================] - 85s 334ms/step - loss: 4.5554 - accuracy: 0.0866 - val_loss: 3.9657 - val_accuracy: 0.2192
Epoch 9/20
255/255 [==============================] - 85s 334ms/step - loss: 4.3864 - accuracy: 0.1055 - val_loss: 3.7013 - val_accuracy: 0.2830
Epoch 10/20
255/255 [==============================] - 85s 335ms/step - loss: 4.2068 - accuracy: 0.1279 - val_loss: 3.4504 - val_accuracy: 0.3254
Epoch 11/20
255/255 [==============================] - 86s 336ms/step - loss: 4.0005 - accuracy: 0.1566 - val_loss: 3.2381 - val_accuracy: 0.3616
Epoch 12/20
255/255 [==============================] - 86s 336ms/step - loss: 3.8427 - accuracy: 0.1804 - val_loss: 3.0073 - val_accuracy: 0.4070
Epoch 13/20
255/255 [==============================] - 86s 336ms/step - loss: 3.6528 - accuracy: 0.2086 - val_loss: 2.7480 - val_accuracy: 0.4573
Epoch 14/20
255/255 [==============================] - 86s 336ms/step - loss: 3.4740 - accuracy: 0.2388 - val_loss: 2.5650 - val_accuracy: 0.4923
Epoch 15/20
255/255 [==============================] - 86s 337ms/step - loss: 3.3342 - accuracy: 0.2574 - val_loss: 2.3789 - val_accuracy: 0.5396
Epoch 16/20
255/255 [==============================] - 86s 336ms/step - loss: 3.1701 - accuracy: 0.2883 - val_loss: 2.1910 - val_accuracy: 0.5801
Epoch 17/20
255/255 [==============================] - 86s 338ms/step - loss: 3.0306 - accuracy: 0.3140 - val_loss: 2.0447 - val_accuracy: 0.6102
Epoch 18/20
255/255 [==============================] - 85s 335ms/step - loss: 2.8898 - accuracy: 0.3434 - val_loss: 1.8901 - val_accuracy: 0.6341
Epoch 19/20
255/255 [==============================] - 86s 337ms/step - loss: 2.7429 - accuracy: 0.3662 - val_loss: 1.7542 - val_accuracy: 0.6568
Epoch 20/20
255/255 [==============================] - 86s 335ms/step - loss: 2.6498 - accuracy: 0.3849 - val_loss: 1.6486 - val_accuracy: 0.6783
plot_training_history(resnet_history_fine_tune)
num_samples = len(df_val)
# Stack all validation images into one float32 array (this holds the full set in memory)
X_val = np.array(df_val['image'].tolist()).astype(np.float32)
y_val_true = np.array([np.argmax(label) for label in df_val['label_categorical']])
y_val_pred = []
# Predict in batches to keep each forward pass small
for i in range(0, num_samples, batch_size):
    batch_imgs = X_val[i:i+batch_size]
    preds = resnet_tuned_model.predict(batch_imgs, verbose=0)
    batch_preds = np.argmax(preds, axis=1)
    y_val_pred.extend(batch_preds)
y_val_pred = np.array(y_val_pred)
print("X_val shape:", X_val.shape)
print("X_val dtype:", X_val.dtype)
X_val shape: (1629, 224, 224, 3)
X_val dtype: float32
target_names = label_encoder.classes_ if 'label_encoder' in globals() else None
resnet_tuned_report = classification_report(
    y_val_true, y_val_pred,
    target_names=target_names,
    output_dict=True,
    zero_division=1  # Undefined (0/0) metrics are reported as 1 instead of raising a warning
)
#print("Unique y_true:", np.unique(y_val_true))
#print("Unique y_pred:", np.unique(y_val_pred))
#unique_preds, counts = np.unique(y_val_pred, return_counts=True)
#print("Predicted class distribution:", dict(zip(unique_preds, counts)))
df_resnet_tuned_report = pd.DataFrame(resnet_tuned_report).transpose()
acc = accuracy_score(y_val_true, y_val_pred)
df_resnet_tuned_report.loc["overall_accuracy"] = [acc, None, None, None]
df_resnet_tuned_report.to_csv("resnet_tuned_classification_report.csv")
print(f"Tuned ResNet Accuracy: {acc:.4f}")
print("Average Resnet Summary Metrics:")
print(df_resnet_tuned_report.tail(3)[["precision", "recall", "f1-score"]])
Tuned ResNet Accuracy: 0.0043
Average Resnet Summary Metrics:
precision recall f1-score
macro avg 0.984716 0.005102 0.000044
weighted avg 0.986513 0.004297 0.000037
overall_accuracy 0.004297 NaN NaN
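The accuracy of 0.0043 is close to chance for 196 classes (1/196 ≈ 0.0051), even though the last training epoch reported a `val_accuracy` of 0.6783 on the validation generator. The source does not establish the cause. One possibility, stated here as an assumption, is that the class index order used by `flow_from_dataframe` (string labels, which to my understanding are sorted lexicographically) differs from the order behind `label_categorical`. Another is that the arrays in `df_val['image']` are scaled differently from the images the generator feeds. The sketch below illustrates only the first effect, using hypothetical numeric labels 1 to 12.

```python
# Hypothetical label set: numeric class ids 1..12.
numeric_labels = list(range(1, 13))

# Order A: numeric sort (e.g. an encoder fitted on integer labels).
index_by_numeric = {lab: i for i, lab in enumerate(sorted(numeric_labels))}

# Order B: lexicographic sort of the string form ("1", "10", "11", "12", "2", ...).
index_by_string = {
    int(s): i for i, s in enumerate(sorted(str(lab) for lab in numeric_labels))
}

# A perfect model trained under order B outputs index_by_string[label].
y_true_a = [index_by_numeric[lab] for lab in numeric_labels]
y_pred_b = [index_by_string[lab] for lab in numeric_labels]
naive_acc = sum(t == p for t, p in zip(y_true_a, y_pred_b)) / len(numeric_labels)

# Remap predictions from order B back to order A through the label itself.
label_from_b = {i: lab for lab, i in index_by_string.items()}
y_pred_a = [index_by_numeric[label_from_b[p]] for p in y_pred_b]
fixed_acc = sum(t == p for t, p in zip(y_true_a, y_pred_a)) / len(numeric_labels)

print(f"naive: {naive_acc:.3f}, remapped: {fixed_acc:.3f}")
```

Evaluating with the same generator and label mapping that were used during training (for example via the generator's `class_indices` attribute) would rule this mismatch out.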
# Use the ResNet validation labels and predictions (not the GoogLeNet y_true / y_pred)
cm = confusion_matrix(y_val_true, y_val_pred)
# Drop the 4 summary rows (accuracy, macro avg, weighted avg, overall_accuracy)
df_support = df_resnet_tuned_report.iloc[:-4]
top_10_classes = df_support.sort_values("support", ascending=False).head(10).index.tolist()
if target_names is not None:
    top_10_indices = [np.where(target_names == cls)[0][0] for cls in top_10_classes]
else:
    top_10_indices = list(map(int, top_10_classes))  # fallback if no class names
cm_top10 = cm[np.ix_(top_10_indices, top_10_indices)]
plt.figure(figsize=(10, 8))
sns.heatmap(cm_top10, annot=True, fmt='d',
            xticklabels=top_10_classes,
            yticklabels=top_10_classes,
            cmap='Blues')
plt.title("Tuned ResNet - Confusion Matrix (Top 10 Classes)")
plt.xlabel("Predicted")
plt.ylabel("True")
plt.tight_layout()
plt.show()
# First, add a column to identify each model
df_resnet_classification_report_tail = df_resnet_classification_report.tail(4).copy()
df_resnet_classification_report_tail['Model'] = 'ResNet Untuned (10 Epochs)'
df_resnet_tuned_report_tail = df_resnet_tuned_report.tail(4).copy()
df_resnet_tuned_report_tail['Model'] = 'ResNet Tuned (20 Epochs)'
df_googlenet_classification_report_tail = df_googlenet_classification_report.tail(4).copy()
df_googlenet_classification_report_tail['Model'] = 'GoogLeNet Untuned (10 Epochs)'
df_googlenet_tuned_report_tail = df_googlenet_tuned_report.tail(4).copy()
df_googlenet_tuned_report_tail['Model'] = 'GoogLeNet Tuned (20 Epochs)'
df_combined_tail = pd.concat([
df_resnet_classification_report_tail,
df_resnet_tuned_report_tail,
df_googlenet_classification_report_tail,
df_googlenet_tuned_report_tail
])
df_combined_tail = df_combined_tail.reset_index().rename(columns={'index': 'Metric'})
df_combined_tail = df_combined_tail[['Model', 'Metric', 'precision', 'recall', 'f1-score']]
df_combined_tail.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 16 entries, 0 to 15 Data columns (total 5 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 Model 16 non-null object 1 Metric 16 non-null object 2 precision 16 non-null float64 3 recall 12 non-null float64 4 f1-score 12 non-null float64 dtypes: float64(3), object(2) memory usage: 768.0+ bytes
df_combined_tail
| | Model | Metric | precision | recall | f1-score |
|---|---|---|---|---|---|
| 0 | ResNet Untuned (10 Epochs) | accuracy | 0.005525 | 0.005525 | 0.005525 |
| 1 | ResNet Untuned (10 Epochs) | macro avg | 0.933837 | 0.006013 | 0.000318 |
| 2 | ResNet Untuned (10 Epochs) | weighted avg | 0.946754 | 0.005525 | 0.000312 |
| 3 | ResNet Untuned (10 Epochs) | overall_accuracy | 0.005525 | NaN | NaN |
| 4 | ResNet Tuned (20 Epochs) | accuracy | 0.004297 | 0.004297 | 0.004297 |
| 5 | ResNet Tuned (20 Epochs) | macro avg | 0.984716 | 0.005102 | 0.000044 |
| 6 | ResNet Tuned (20 Epochs) | weighted avg | 0.986513 | 0.004297 | 0.000037 |
| 7 | ResNet Tuned (20 Epochs) | overall_accuracy | 0.004297 | NaN | NaN |
| 8 | GoogLeNet Untuned (10 Epochs) | accuracy | 0.228975 | 0.228975 | 0.228975 |
| 9 | GoogLeNet Untuned (10 Epochs) | macro avg | 0.404701 | 0.227161 | 0.205622 |
| 10 | GoogLeNet Untuned (10 Epochs) | weighted avg | 0.423114 | 0.228975 | 0.217884 |
| 11 | GoogLeNet Untuned (10 Epochs) | overall_accuracy | 0.228975 | NaN | NaN |
| 12 | GoogLeNet Tuned (20 Epochs) | accuracy | 0.004911 | 0.004911 | 0.004911 |
| 13 | GoogLeNet Tuned (20 Epochs) | macro avg | 0.806229 | 0.003757 | 0.000194 |
| 14 | GoogLeNet Tuned (20 Epochs) | weighted avg | 0.802465 | 0.004911 | 0.000241 |
| 15 | GoogLeNet Tuned (20 Epochs) | overall_accuracy | 0.004911 | NaN | NaN |
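The comparison in the table can be made explicit by ranking the four runs on macro-averaged F1, which is less affected by the `zero_division=1` setting than precision. The values below are copied from the `macro avg` rows; the dictionary name is illustrative.

```python
# Macro-averaged F1 per model, copied from the table above.
macro_f1 = {
    "ResNet Untuned (10 Epochs)": 0.000318,
    "ResNet Tuned (20 Epochs)": 0.000044,
    "GoogLeNet Untuned (10 Epochs)": 0.205622,
    "GoogLeNet Tuned (20 Epochs)": 0.000194,
}

# Rank from best to worst and pick the top entry.
ranking = sorted(macro_f1, key=macro_f1.get, reverse=True)
best_model = ranking[0]

for name in ranking:
    print(f"{name:32s} {macro_f1[name]:.6f}")
```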
Hence, the final model selected for test evaluation is the untuned GoogLeNet version trained for 10 epochs, the only run whose macro F1 (0.206) is clearly above the others.
batch_size=16
# Step 1: Get true labels
y_test_true = np.array([np.argmax(label) for label in df_testing['label_categorical']])
# Step 2: Predict in batches
y_test_pred = []
for i in range(0, len(df_testing), batch_size):
    batch_imgs = np.array(df_testing['image'].tolist()[i:i+batch_size])
    preds = googlenet_model.predict(batch_imgs, verbose=0)
    batch_preds = np.argmax(preds, axis=1)
    y_test_pred.extend(batch_preds)
y_test_pred = np.array(y_test_pred)
target_names = label_encoder.classes_ if 'label_encoder' in globals() else None
final_googlenet_untuned_report = classification_report(
    y_test_true, y_test_pred,
    target_names=target_names,
    output_dict=True,
    zero_division=1  # Undefined (0/0) metrics are reported as 1 instead of raising a warning
)
df_final_googlenet_untuned_report = pd.DataFrame(final_googlenet_untuned_report).transpose()
acc = accuracy_score(y_test_true, y_test_pred)
df_final_googlenet_untuned_report.loc["overall_accuracy"]= [acc, None, None, None]
df_final_googlenet_untuned_report.to_csv("df_final_googlenet_untuned_classification_report.csv")
print(f"Final GoogleNet(Untuned) against test data Accuracy: {acc:.4f}")
print("Final Untuned GoogleNet metrics against test:")
print(df_final_googlenet_untuned_report.tail(3)[["precision", "recall", "f1-score"]])
Final GoogleNet(Untuned) against test data Accuracy: 0.2194
Final Untuned GoogleNet metrics against test:
precision recall f1-score
macro avg 0.350745 0.219166 0.204758
weighted avg 0.349543 0.219376 0.205623
overall_accuracy 0.219376 NaN NaN
# Use the test labels and predictions (not the earlier validation y_true / y_pred)
cm = confusion_matrix(y_test_true, y_test_pred)
# Drop the 4 summary rows (accuracy, macro avg, weighted avg, overall_accuracy)
df_support = df_final_googlenet_untuned_report.iloc[:-4]
top_10_classes = df_support.sort_values("support", ascending=False).head(10).index.tolist()
if target_names is not None:
    top_10_indices = [np.where(target_names == cls)[0][0] for cls in top_10_classes]
else:
    top_10_indices = list(map(int, top_10_classes))  # fallback if no class names
cm_top10 = cm[np.ix_(top_10_indices, top_10_indices)]
plt.figure(figsize=(10, 8))
sns.heatmap(cm_top10, annot=True, fmt='d',
            xticklabels=top_10_classes,
            yticklabels=top_10_classes,
            cmap='Blues')
plt.title("Untuned GoogLeNet - Test Confusion Matrix (Top 10 Classes)")
plt.xlabel("Predicted")
plt.ylabel("True")
plt.tight_layout()
plt.show()